ABSTRACT


Pattern matching in trees is fundamental to a variety of programming
language systems. However, progress has been slow in satisfying a pressing need
for general purpose pattern matching algorithms that are efficient in both time and
space. We offer asymptotic improvements in both time and space to Chase’s
bottom-up algorithm for pattern preprocessing. A preliminary implementation of
our algorithm runs ten times faster than Chase’s implementation on the hardest
problem instances. Our preprocessing algorithm has the advantage of being on-line
with respect to pattern additions and deletions. It also adapts to favorable
input instances, and on Hoffmann and O’Donnell’s class of Simple Patterns, it performs
better than their special purpose algorithm tailored to this class. We show
how to modify our algorithm using a new decomposition method to obtain a
space/time tradeoff. Finally, we trade a log factor in time for a linear space
bottom-up pattern matching algorithm that handles a wide subclass of Hoffmann
and O’Donnell’s Simple Patterns.


1.  Introduction 

Pattern matching in trees is fundamental to term rewriting systems [21], transformational
programming systems [4, 15, 18, 26, 30, 35], program editing and development systems [10, 23, 32],
code generator generators [14, 17, 19, 29], theorem provers [24], logic programming optimizers that
attempt to replace unification with matching [27], and compilers for functional languages such as
ML [34] and Haskell [22] that have equational function definitions.

This paper describes new solutions to a simple, basic kind of pattern matching problem of
wide application. The problem is specified formally in terms of a partially ordered pattern
language. Given an alphabet $\Sigma = F \cup \{v\}$ with one distinguished variable $v$ and a finite set $F$ of
function symbols, where each such symbol $f \in F$ has arity $A(f)$, the linear pattern language for
$\Sigma$ is the smallest set of terms that includes (i) $v$, (ii) constant $c$ if $c$ is a function symbol with arity 0,
and (iii) $f(p_1, ..., p_k)$, which we call an $f$-pattern, if $f$ is a function symbol of arity $k > 0$ and its
arguments $p_1, ..., p_k$ are patterns in the language.

The set of subpatterns $sub(p)$ of a pattern $p$ is the smallest set that contains $p$, and, if $p$ is an
$f$-pattern with $A(f) > 0$, then it also contains the subpatterns of the arguments of $p$. If $q$ and $p$ are
two different patterns and $q$ is a subpattern of $p$, then $p$ is said to properly enclose $q$. The size of a
pattern $p$ is the number of occurrences of alphabet symbols in $p$.

Linear pattern matching is defined as follows. Pattern $p_1$ is said to be more general than pattern
$p_2$, denoted by $p_1 \geq p_2$, iff either (i) $p_1$ is $v$, or (ii) $p_1$ is $f(x_1, ..., x_k)$, $p_2$ is $f(y_1, ..., y_k)$, and
$x_i \geq y_i$ for $i = 1, ..., k$. If $p_1 \geq p_2$, we also say that $p_1$ matches $p_2$ or that $[p_1, p_2]$ is a match. A
subsumption dag for a set of patterns $P$ is a directed acyclic graph that represents the reflexive transitive
reduction of the partial ordering $(P, \geq)$. See the example illustrated in Fig. 1, where $a$ is a
constant and $f$ is a binary function symbol.


[Figure: subsumption dag; the only node label recovered is f(a, a)]

Fig. 1 Subsumption Dag


By  the  preceding  definition  variable  v  serves  as  a  place  holder  during  matching.  Thus,  testing 
whether  pattern  p  matches  pattern  q  is  equivalent  to  testing  whether  q  can  be  formed  from  p  by 
replacing  occurrences  of  v  in  p  by  patterns,  each  of  which  may  be  different. 
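
The matching relation just described can be illustrated with a minimal Python sketch. The tuple encoding of patterns (the string "v" for the variable, and a tuple whose first component is the function symbol for every other pattern) and the name `more_general` are assumptions made for this illustration only; they do not come from the paper.

```python
V = "v"  # the single distinguished variable (encoding assumed for this sketch)

def more_general(p, q):
    """Return True iff pattern p is more general than (matches) pattern q."""
    if p == V:                       # case (i): the variable matches anything
        return True
    if q == V:                       # a non-variable pattern never matches v
        return False
    # case (ii): same function symbol, same arity, arguments match pairwise
    return (p[0] == q[0] and len(p) == len(q)
            and all(more_general(x, y) for x, y in zip(p[1:], q[1:])))

# Small example with constant a and binary symbol f.
A = ("a",)
f_va = ("f", V, A)
f_av = ("f", A, V)
f_aa = ("f", A, A)
print(more_general(f_va, f_aa), more_general(f_av, f_va))
```

In this encoding f(v, a) matches f(a, a) because f(a, a) arises from f(v, a) by replacing the occurrence of v with a, whereas f(a, v) does not match f(v, a).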

In  order  to  gauge  performance  of  different  pattern  matching  algorithms,  it  is  useful  to  con¬ 
sider  the  following  basic  problem: 

Multi-Pattern Matching Problem: Given a finite set $P$ of patterns and a pattern $t$ called the
subject, find the set $MPTM(P, t) = \{[p, q] : p \in P,\ q \in sub(t) \mid p \geq q\}$ of all patterns in $P$ matching
subpatterns of $t$.

This paper is concerned with linear pattern matching and with solutions to the Multi-Pattern
Matching  Problem  on  a  uniform  cost  sequential  RAM  [1,28].  More  complex  kinds  of  pattern 
matching  can  be  solved  by  extensions  to  our  algorithms. 

However, even for linear pattern matching, solving $MPTM(P, t)$ efficiently seems to be
extremely difficult. The current best space-efficient top-down algorithm to solve $MPTM(P, t)$,
where $P$ contains a single pattern of size $l$ and subject $t$ is of size $n$, takes $O(n \sqrt{l}\, \mathrm{polylog}(l))$ time, a
recent result due to Dubiner, Galil, and Magen [12], which improves Kosaraju's earlier
$O(n\, l^{0.75}\, \mathrm{polylog}(l))$ time bound [25].

Bottom-up pattern matching seems to be even more difficult than top-down matching and is
of special practical importance. In a seminal paper [20] Hoffmann and O’Donnell presented
bottom-up linear pattern matching algorithms to solve $MPTM_P(t)$ for fixed $P$ and subjects $t$ without
variable occurrences. They broke up the problem into two parts: (1) preprocessing $P$, and (2) solving
$MPTM_P(t)$. Their bottom-up solution to $MPTM_P(t)$ was further broken up into repeated solutions
to the following subproblem:

Bottom-Up Subproblem: Given solutions to $MPTM_P(t_i)$, $i = 1, ..., k$, solve $MPTM_P(f(t_1, ..., t_k))$.

Of course, an efficient solution to the Bottom-Up Subproblem is important to bottom-up tree
rewriting, an application that concerned Hoffmann and O’Donnell. They sacrificed time and space
in preprocessing $P$ in return for an $O(k)$ time solution to the Bottom-Up Subproblem (not counting
the time to produce output). Consequently, they obtained an $O(n + o)$ time solution to $MPTM_P(t)$,
where $o$ is the number of pairs in $MPTM_P(t)$, and $n$ is the size of $t$. However, auxiliary space during
computation of $MPTM_P(t)$ was excessive [20] both in theory and in practice (see Chase’s
empirical data [7]).

Hoffmann and O’Donnell’s work has stimulated a number of papers offering heuristic space
improvements [2, 3, 7, 31], and Chase’s method has aroused considerable attention [7]. However,
none of these papers gave proofs of theoretical improvements or promising space/time tradeoffs.

In  this  paper  we  present  three  new  theoretical  results  in  bottom-up  linear  pattern  matching. 

1. At the end of his CAAP ’88 paper [3] Burghardt called for an efficient algorithm for
preprocessing patterns $P$ on-line with respect to additions and deletions of patterns. Such an algorithm
is needed in the RAPTS transformational programming system [4], because incrementally
modifying systems of rewrite rules is a frequent activity, and preprocessing full sets of patterns is
highly expensive.

In this paper we present an efficient pattern preprocessing algorithm that builds the data structures
used in Chase’s pattern matching algorithm in a new way. Our algorithm implements these
data structures on-line with respect to additions and deletions of patterns. When our algorithm is
applied repeatedly to solve batch preprocessing by adding one pattern at a time starting from the
empty set, it runs asymptotically better in time and space than Chase’s batch algorithm.

Hoffmann and O’Donnell obtained a worst case time bound of $O(l^2\, 2^{l (k_{max}+1)})$ for preprocessing
$P$ and a worst case auxiliary space bound of $O(l\, 2^{l k_{max}})$ both for preprocessing $P$ and for computing
$MPTM_P(t)$, where $k_{max}$ is the greatest arity of any function symbol appearing in $P$. Based on
our coarse analysis, Chase’s algorithm improved these bounds to $O(l\, k_{max}\, 2^{l k_{max}})$ time for preprocessing
$P$, $O((k_{max} + 2^{k_{max}})\, 2^{l k_{max}})$ space for preprocessing $P$, and $O(k_{max}\, 2^{l k_{max}})$ space for computing
$MPTM_P(t)$. Based on the same parameterization, our algorithm has the same space bounds
as Chase but an improved $O(l\, 2^{l k_{max}})$ time bound for preprocessing $P$. Based on a more accurate
parameterization and deeper analysis of the problem, our algorithm can be observed to have a more
striking theoretical advantage over Chase’s algorithm.

Hoffmann and O’Donnell presented a special purpose algorithm tailored to the class of Simple
Patterns with polynomial worst case preprocessing time and space. Our algorithm adapts to
input instances in this class and performs better in both worst case asymptotic time and space than
their special purpose batch algorithm.

A prototype implementation of our algorithm is currently being used in the RAPTS transformational
programming system [6] as the basis for searching, conditional rewriting, and static
semantic analysis. A preliminary C implementation of our algorithm outperforms Chase’s implementation
of his algorithm on the same data, machine, and compiler [7]; on the hardest problem
instances we obtain a ten-fold speedup. We believe that a more careful implementation of our
algorithm would show a more dramatic improvement.

2. Our first result is modified to obtain a general space/time tradeoff. Roughly speaking, for
parameter $q > 1$, we trade $O(q^2)$ in time to solve the Bottom-Up Subproblem in return for auxiliary
space $O(l\, k_{max}\, q^2\, 2^{l/q} + q\, 2^{l k_{max}/q})$.

3. In bottom-up pattern matching, the main difficulty that sorely needs to be overcome is
space utilization. We present an algorithm for a subclass of Hoffmann and O’Donnell’s Simple
Patterns that runs in $O(l)$ space and $O(\log l)$ time to solve the Bottom-Up Subproblem. A theoretical
improvement to $O(\log\log l)$ time for the Bottom-Up Subproblem is obtained using Dietz’s persistent
form [8] of the van Emde Boas priority queue [37]. Previous bounds due to Hoffmann and
O’Donnell are $O(l^2)$ time and space for an algorithm tailored to binary Simple Patterns (which our
subclass properly includes) and $O(l^{k_{max}+1})$ space with $O(k)$ Subproblem time for an algorithm handling
all Simple Patterns. Thus, we offer a quadratic space improvement over the latter algorithm
for binary patterns and an even more dramatic improvement for patterns of greater arity. Our space
compression is obtained by applying persistent data structures in a new way.

This paper is organized as follows. In the next section we discuss Hoffmann and O’Donnell’s
and Chase’s solutions to multi-pattern matching. After that we present our on-line preprocessing
algorithm, its adaptation to Simple Patterns, handling deletions, and a general space/time tradeoff.
In the final section we present our third result, which deviates significantly from the earlier strategies
of either Hoffmann and O’Donnell or Chase.

2.  Algorithms  for  Bottom-up  pattern  matching 

2.1.  Notation 

In addition to standard mathematical notation it will sometimes be convenient to use certain
unconventional terminology. We let expression $A$ with $x$ abbreviate set element addition $A \cup \{x\}$
(where in this context $A$ is interpreted as the empty set if it is undefined). Likewise, $A$ less $x$
represents set element deletion $A - \{x\}$. If $f$ is a binary relation, then domain $f = \{x : [x, y] \in f\}$,
range $f = \{y : [x, y] \in f\}$, and $f^{-1}$ denotes the inverse map $\{[y, x] : [x, y] \in f\}$. Also, $f(x)$ denotes
function application (undefined if $f$ is multi-valued at $x$ or if $x \notin$ domain $f$), $f\{x\}$ denotes multi-valued
map application with value $\{y : [x, y] \in f\}$, and $f[S]$ denotes the image of set $S$ under $f$ with
value $\{y : [x, y] \in f \mid x \in S\}$. The number of elements in a finite set $S$ is denoted by $|S|$. If $f$ is a
binary relation (perhaps a function), then the number of pairs in its graph representation is denoted
by $|f|$. If op is any binary, associative, and commutative operator, and $S = \{x_1, ..., x_k\}$ is a set, then
the APL-like reduction notation op/$S$ denotes expansion $x_1$ op ... op $x_k$ with an arbitrary ordering of
arguments. For example, $\cup/S = \bigcup_{T \in S} T$. If $S$ is a set, we use the for-loop notation for $x \in S$ loop
block($x$) end to execute block($x$) repeatedly for each value $x \in S$ without repetition. Finally,
assignment $A$ op:= $x$ is used to abbreviate $A := A$ op $x$.

2.2.  Hoffmann  and  O’Donnell’s  Bottom-Up  Algorithm 

Bottom-up solutions presented by Hoffmann and O’Donnell and Chase treat the set $P$ of patterns
as fixed and the subject $t$ (which for them has no variables) as the only parameter that can
vary. In a bottom-up strategy to solve the Multi-Pattern Matching Problem, a complete set
$MPTM_P(q)$ of matches is found for each subpattern $q$ of $t$ without reference to any subpattern of $t$
that properly encloses $q$.

Hoffmann and O’Donnell explain their multi-pattern matching algorithm in terms of the following
two notions. If $P$ is a set of patterns, then the pattern forest $PF$ of $P$ is the set of subpatterns
of all the patterns in $P$. If $PF$ is the pattern forest for a set $P$ of patterns and $t$ is the subject,
then the match set $MS(t)$ for $t$ is defined by the rule $MS(t) = \{q \in PF \mid q \geq t\}$.

Hoffmann and O’Donnell use an equivalent recursive definition of match sets (but restricted
to subjects without variable occurrences) to obtain an efficient bottom-up algorithm. The recursive
rules shown below add a new rule for $MS(v)$ to Hoffmann and O’Donnell’s rules so that match sets
can be defined for arbitrary patterns.

$MS(v) = \{v\}$

$MS(c) = \{v\}$, when constant $c \notin PF$
$MS(c) = \{v, c\}$, when constant $c \in PF$

(1) $MS(f(t_1, ..., t_k)) = \{f(q_1, ..., q_k) \in PF \mid q_i \in MS(t_i),\ i = 1, ..., k\} \cup \{v\}$

Surprisingly, this new rule is merely a formalism, since it gives rise to the exact same collection of
match sets as derived by Hoffmann and O’Donnell. This is true, because the match set $MS(p)$ for
an arbitrary pattern $p$ is identical to the match set $MS(t)$ for any pattern $t$ formed from $p$ by replacing
occurrences of $v$ in $p$ by occurrences of arbitrary constants that do not belong to $PF$.
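
The recursive rules for match sets can be followed directly in a short Python sketch. It is meant only to make the recursion concrete and recomputes every match set from scratch, so it is not the table-driven algorithm of Hoffmann and O’Donnell. The tuple encoding of patterns and the names `pattern_forest` and `match_set` are assumptions of this sketch.

```python
V = "v"  # the distinguished variable; other patterns are tuples (symbol, arg_1, ..., arg_k)

def pattern_forest(P):
    """Collect all subpatterns of all patterns in P."""
    PF = set()
    def walk(p):
        PF.add(p)
        if p != V:
            for arg in p[1:]:
                walk(arg)
    for p in P:
        walk(p)
    return PF

def match_set(t, PF):
    """Compute MS(t) bottom-up from the match sets of the arguments of t."""
    if t == V:
        return frozenset([V])                      # rule for the variable
    arg_sets = [match_set(s, PF) for s in t[1:]]   # empty list for a constant
    ms = {V}                                       # v belongs to every match set
    for q in PF:
        if (q != V and q[0] == t[0] and len(q) == len(t)
                and all(qi in S for qi, S in zip(q[1:], arg_sets))):
            ms.add(q)
    return frozenset(ms)

A, B = ("a",), ("b",)                              # b is a constant outside PF
P = [("f", A, V), ("f", V, A), ("f", V, V)]
PF = pattern_forest(P)
print(sorted(match_set(("f", A, B), PF), key=str))
```

For a constant the argument list is empty, so the loop adds the constant exactly when it belongs to the pattern forest, which reproduces the two constant rules.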

After determining match sets for constants and variable occurrences in subject $t$, Hoffmann
and O’Donnell’s algorithm solves the Bottom-Up Subproblem by identifying the match set for
each subpattern $f(t_1, ..., t_k)$ of $t$ based on the match sets for $t_i$, $i = 1, ..., k$. This task, which we call
the Bottom-Up Step, computes expression (1) by an $O(k)$ time lookup in a $k$-dimensional array storing
transition map $\tau_f$, where $\tau_f(MS(t_1), ..., MS(t_k)) = MS(f(t_1, ..., t_k))$.

For consistency, throughout this paper we consider an instance of the Multi-Pattern Matching
Problem with pattern set $P$, pattern forest $PF$, and subject $t$. We also use the following parameters:

$n$ = size of $t$
$\Gamma$ = the set of match sets for $P$
$l = |PF|$
$o = |MPTM_P(t)|$
$k_{max}$ = maximum arity of any function symbol appearing in $PF$

In order to compute Step (1) and print the set $MS(f(t_1, ..., t_k)) \cap P$ of patterns that match
$f(t_1, ..., t_k)$ in time $O(k + |MS(f(t_1, ..., t_k)) \cap P|)$, Hoffmann and O’Donnell preprocess the patterns
in $P$ to

i. encode each pattern in $PF$ as a distinct integer from 1 to $l$, and represent patterns as
trees in the obvious way (implemented in compressed form as dags);

ii. compute all match sets, and encode each such set as a distinct integer from 1 to $|\Gamma|$;

iii. compute the subset of patterns in $P$ belonging to the $i$th match set for $i = 1, ..., |\Gamma|$;

iv. compute a transition map $\tau_f$ for every $k$-ary function symbol $f$ occurring in $P$ so that
$\tau_f(MS(t_1), ..., MS(t_k)) = MS(f(t_1, ..., t_k))$; $\tau_v = \{v\}$, and $\tau_c = \{v, c\}$ if $c$ is any constant
appearing in $PF$; transition maps $\tau_f$ are implemented as $k$-dimensional arrays accessed
using integer encodings of match sets.

After preprocessing the patterns in $P$, Hoffmann and O’Donnell’s algorithm solves the
Multi-Pattern Matching Problem by repeatedly solving Step (1) from innermost to outermost subpattern
of $t$. Their worst case time is $O(n + o)$ after preprocessing $P$. The array storing the transition
map $\tau_f$ for each $k$-ary function symbol $f$ appearing in $PF$ uses $\Omega(|\Gamma|^k)$ space, where the
number $|\Gamma|$ of match sets can be $\Omega(2^l)$, which is expensive in practice. Their rough bound on
preprocessing time is $O(l^2\, |\Gamma|^{k_{max}+1})$.

2.3.  Chase’s  Improvement 

Chase was able to improve Hoffmann and O’Donnell’s method by exploiting the deeper
structure of the pattern set $P$ to reduce the size of transition maps [7]. Chase’s heuristic is slower by
a constant factor but preserves the $O(k)$ asymptotic time for solving the Bottom-Up Subproblem.

Let $PF$ be the pattern forest for $P$, and assume that $PF$ contains variable $v$. For each $k$-ary
function $f$ appearing in $PF$ and each $i = 1, ..., k$, Chase introduced projection $\Pi_i^f = \{q_i : f(q_1, ..., q_k)
\in PF\}$ containing the set of patterns appearing as the $i$th argument of some $f$-pattern in $PF$. Chase
made the crucial observation that identity (1) could be replaced by

(2) $MS(f(t_1, ..., t_k)) = \{f(q_1, ..., q_k) \in PF \mid q_i \in MS(t_i) \cap \Pi_i^f,\ i = 1, ..., k\} \cup \{v\}$

which gives rise to a modified Bottom-Up Step with improved auxiliary space.

Chase’s Bottom-Up Step to compute (2) involves two substeps. First a conversion map $\mu_i^f$ is
used to turn each Hoffmann and O’Donnell match set $MS(t_i)$ into a Chase match set
$\mu_i^f(MS(t_i)) = MS(t_i) \cap \Pi_i^f$ for $i = 1, ..., k$. If any of these Chase match sets are empty, then
$MS(f(t_1, ..., t_k)) = \{v\}$. Otherwise, Chase’s transition map $\theta_f$ is used to obtain the Hoffmann and
O’Donnell match set $\theta_f(\mu_1^f(MS(t_1)), ..., \mu_k^f(MS(t_k))) = MS(f(t_1, ..., t_k))$. Chase’s implementation
uses integer encodings for both kinds of match sets, one-dimensional arrays to implement each
conversion map $\mu_i^f$, and a $k$-dimensional array for $\theta_f$.

A straightforward set theoretic argument can be used to explain why Chase’s transition map
utilizes space better than Hoffmann and O’Donnell’s. Whenever every Chase match set $\mu_i^f(MS(t_i))$
is nonempty, $i = 1, ..., k$, we know that identity $\theta_f(\mu_1^f(MS(t_1)), ..., \mu_k^f(MS(t_k))) =
\tau_f(MS(t_1), ..., MS(t_k))$ holds. Consequently, if $\mu_i^f$ is not one-to-one for some $i$, we know that $|\theta_f| <
|\tau_f|$. The essential idea may be simply put: for any two finite functions $h$ and $g$, where $g$ is not
one-to-one and domain $h \subseteq$ range $g$, we have $|h| < |h \circ g|$.

Chase also provided extensive empirical evidence to show that $\theta_f$ is much smaller than $\tau_f$ in
practice. Consider the example in Fig. 2. The Chase match sets associated with the first component
of $f$ are $c_1 = \{1\}$ and $c_2 = \{1, 2\}$; the Chase match sets associated with the second component
of $f$ are $d_1 = \{1\}$ and $d_2 = \{1, 2\}$. The Chase conversion and transition maps store 16 entries
compared with 36 entries in Hoffmann and O’Donnell’s transition map $\tau_f$.


PF:         v     a     f(a,v)    f(v,a)    f(v,v)
Encoding:   1     2     3         4         5

$\Gamma$:          {1}   {1,2}   {1,3,5}   {1,3,4,5}   {1,4,5}   {1,5}
Encoding:   m1    m2      m3        m4          m5        m6

$\mu_1^f$           $\mu_2^f$           $\theta_f$    d1    d2
m1   c1       m1   d1       c1     m6    m5
m2   c2       m2   d2       c2     m3    m4
m3   c1       m3   d1
m4   c1       m4   d1
m5   c1       m5   d1
m6   c1       m6   d1

Fig. 2 Chase’s Data Organization
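
The entry counts quoted for the example of Fig. 2 can be re-derived with a small Python sketch. It enumerates the match sets by iterating rule (1) to a fixpoint and then intersects them with the projections of f. The brute-force enumeration, the tuple encoding of patterns, and all identifiers below are assumptions of this sketch and are not the preprocessing algorithm of Chase or of this paper.

```python
from itertools import product

V = "v"
A = ("a",)
# Pattern forest of the example, in the order of the encodings 1..5.
PF = [V, A, ("f", A, V), ("f", V, A), ("f", V, V)]

def step(symbol, arg_sets):
    """Apply rule (1) to the already known match sets of the arguments."""
    ms = {V}
    for q in PF:
        if (q != V and q[0] == symbol and len(q) - 1 == len(arg_sets)
                and all(qi in S for qi, S in zip(q[1:], arg_sets))):
            ms.add(q)
    return frozenset(ms)

def all_match_sets():
    """Close the match sets of v and of the constant a under the binary symbol f."""
    gamma = {frozenset([V]), step("a", [])}
    while True:
        new = {step("f", [s1, s2]) for s1, s2 in product(gamma, repeat=2)} - gamma
        if not new:
            return gamma
        gamma |= new

gamma = all_match_sets()
# Projections: patterns occurring as the first / second argument of an f-pattern.
proj = [{q[i] for q in PF if q != V and q[0] == "f"} for i in (1, 2)]
# Chase match sets: distinct intersections of match sets with each projection.
chase = [{frozenset(ms & pi) for ms in gamma} for pi in proj]

hod_entries = len(gamma) ** 2                                   # 2-dimensional transition array
chase_entries = 2 * len(gamma) + len(chase[0]) * len(chase[1])  # two conversion maps + transition map
print(len(gamma), hod_entries, chase_entries)
```

Under these assumptions the sketch finds six match sets, two Chase match sets per argument position, and the totals of 36 and 16 entries stated in the text.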


3.  Incremental  preprocessing 

We will present a preprocessing algorithm that incrementally constructs maps $\mu$ and $\theta$ and is
on-line with respect to modifications to $P$ by adding or deleting patterns. When used to solve the
batch preprocessing problem for fixed $P$, our algorithm performs asymptotically better in time and
space than Chase’s. It is convenient to specify our algorithm in terms of two abstract datatypes.

3.1.  Abstract  Sets 

The first abstract datatype is called a Set Encoding Structure (abbr. SE_structure), which is a
4-tuple $(U, D, Q, \top)$ with finite universe $U$, primary set $D \subseteq 2^U$, secondary set $Q \subseteq U$, and top element
$\top \in U$, where $\{\top\} \in D$, and every set within $D$ contains $\top$. For simplicity we will assume for
now that $U$ and $Q$ are fixed in order to focus on the more difficult problem of updating $D$. Later,
when we show how SE_structures are used by our preprocessing algorithm, details on how they are
initialized and how to update $U$ and $Q$ will be supplied. Five operations on SE_structures are
described below. A sixth operation, deletion, will be described later in a separate section.

1. create: Initialize $D$. This operation is performed only once for an SE_structure before
any of the other operations below.

$D := \{\{\top\}\}$, where $\top$ is the top element

2. replace(d, z): Replace $d \in D$ by new set $d$ with $z$, where $z \in U$, for which we write,

d with:= z

3. add(d, z): Add new set $d$ with $z$ to $D$, where $d \in D$ and $z \in U$; that is,

D with:= d with z

4. query(d): Retrieve set $d \cap Q$, where $d \in D$.

5. index(c): Retrieve set $\{d \in D \mid c \in d\}$, where $c \in U$.

We will implement SE_structures using a data structure called an SE_tree (see Fig. 3), whose
nodes correspond to distinct subsets of the universe $U$. Each set $d_x$ belonging to primary set $D$ is
associated with a node $x$ in the SE_tree; that is, $x$ ’encodes’ $d_x$. The root encodes the singleton set
containing the top element. However, the set $d_x$ associated with a node $x$ in the tree may not
necessarily belong to $D$. If $d_x$ does not belong to $D$, it is called a gap. If $d_x$ and $d_y$ are sets
associated with tree nodes $x$ and $y$, then $x$ is a descendant of $y$ in the tree only if $d_y \subset d_x$.

SE_trees are implemented with two kinds of records: a node_record for each node in the tree
and a U_record for each symbol in $U$. We will sometimes avoid distinguishing a node from its
node_record implementation. The node_record for node $x$ contains five fields: 1. a D field containing
1 if the node is not a gap and 0 if it is, 2. a sibling field with a pointer to the right sibling of
$x$, 3. a succ field with a pointer to the leftmost child of $x$, 4. a Q_query field storing a possibly
empty subset of $Q$, and 5. a Q_ancestor field with a pointer to the nearest ancestor in the tree with a
nonempty Q_query.

For each node $x$ the value of the subset of $Q$ stored in the Q_query field is denoted by
Q_query(x). The set is implemented by a pointer to a list of pointers to U_records for each symbol
in Q_query(x). If $d_x$ represents the set associated with node $x$, then the sets Q_query(y) for nodes
$y$ along the path from $x$ to the root are mutually disjoint, and their union
has the value query(x) $= d_x \cap Q$.

The U_record for symbol $c$ has three fields: 1. a U field containing symbol $c$, 2. a Q field
with a bit indicating whether $c$ belongs to $Q$, and 3. a D_index field storing the subset of tree nodes
$x$ closest to the root such that the associated set $d_x$ contains $c$.

We denote the subset of nodes associated with the D_index field in the U_record for symbol $c$
by D_index(c). It is implemented by a pointer to a list of pointers to node_records for each node
in D_index(c). Thus, the set of tree descendants of nodes belonging to D_index(c) has the value
computed by operation index(c) $= \{d \in D \mid c \in d\}$.

Fig. 4 illustrates how SE_trees compress the space needed to store match sets. Chase’s algorithm
stores fifteen pattern entries to represent the collection of match sets $\Gamma$ in the example shown
in Fig. 2; our algorithm stores these same match sets in an SE_tree using only nine pattern entries.


Fig. 4 SE_tree for (PF, $\Gamma$, v)

The create operation $D := \{\{\top\}\}$, with $\top$ the top element, is implemented by adding a new tree
root with empty sibling, succ, and Q_ancestor fields, D bit on, and Q_query containing $\top$ if $\top \in Q$
and empty if not. Within the U_record for $\top$ we initialize D_index to a singleton set containing the
newly created root.

Implementing the replace operation d with:= z has two cases. In the first case, called a
nondestructive replace, the tree node $x$ associated with $d$ is not a leaf (i.e. succ is nonempty). In
this case (i) unset the D bit in $x$ (which makes $x$ a gap), and create a new tree node $y$ as a child of $x$,
(ii) if Q_query(x) is nonempty, then make the Q_ancestor in $y$ point to $x$; otherwise, make it point
to the same record that the Q_ancestor in $x$ points to, and (iii) set the D bit in $y$. In the second case,
where $x$ is a leaf, we reuse $x$ to represent the new set $d$ with $z$. In this case, called a
destructive replace, we assume that nodes $x$ and $y$ are the same. In either case, if $z$ belongs to $Q$,
add $z$ to Q_query(y). Finally, add $y$ to D_index(z).

To implement the add operation D with:= d with z we let $x$ be the tree node associated with
set $d$. Create a new tree node $y$ associated with set $d$ with $z$, and make $y$ a child of $x$. If
Q_query(x) is nonempty, then make the Q_ancestor in $y$ point to $x$; otherwise, make it point to the
same record that the Q_ancestor in $x$ points to. In both cases, set the D bit in $y$. If $z$ belongs to $Q$,
add $z$ to Q_query(y). Finally, add $y$ to D_index(z).

The query operation $d \cap Q$ is implemented as follows. If $x$ is the tree node associated with $d$,
then retrieve the elements in each Q_query set along the path starting from $x$ following
Q_ancestor pointers. Recall that the Q_query sets along this path are disjoint.

Finally, SE_trees support a straightforward implementation of the index operation
$\{d \in D \mid c \in d\}$. Form a list of records $x$ (where set $d_x$ belongs to $D$) occurring in subtrees rooted
in nodes belonging to D_index(c).
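
The five operations described above can be sketched compactly in Python. The sketch simplifies the record layout: a Python list of children stands in for the succ and sibling pointers, and a dictionary keyed by symbol stands in for the U_records. It also assumes that add and replace are only called with an element z that is not already in the set d. The class and attribute names are assumptions of this sketch, not the authors' implementation.

```python
class Node:
    """One SE_tree node; it stores only what is new relative to its ancestors."""
    def __init__(self):
        self.in_D = True        # D bit: False marks a gap
        self.children = []      # stands in for the succ/sibling pointers
        self.q_query = []       # elements of Q introduced at this node
        self.q_ancestor = None  # nearest ancestor with a nonempty q_query

class SETree:
    def __init__(self, top, Q):
        """create: D := {{top}}."""
        self.Q = set(Q)
        self.d_index = {}       # symbol -> topmost nodes whose set contains it
        self.root = Node()
        self._record(self.root, top)

    def _new_child(self, x):
        y = Node()
        x.children.append(y)
        y.q_ancestor = x if x.q_query else x.q_ancestor
        return y

    def _record(self, y, z):
        if z in self.Q:
            y.q_query.append(z)
        self.d_index.setdefault(z, []).append(y)

    def add(self, x, z):
        """add: D with:= d with z, where node x encodes d."""
        y = self._new_child(x)
        self._record(y, z)
        return y

    def replace(self, x, z):
        """replace: d with:= z, where node x encodes d."""
        if x.children:          # nondestructive: x becomes a gap
            x.in_D = False
            y = self._new_child(x)
        else:                   # destructive: reuse the leaf
            y = x
        self._record(y, z)
        return y

    def query(self, x):
        """query: elements of d intersected with Q, via Q_ancestor pointers."""
        out = []
        while x is not None:
            out.extend(x.q_query)
            x = x.q_ancestor
        return out

    def index(self, c):
        """index: nodes encoding sets of D that contain c."""
        out, stack = [], list(self.d_index.get(c, []))
        while stack:
            node = stack.pop()
            if node.in_D:
                out.append(node)
            stack.extend(node.children)
        return out
```

As a usage example, the six match sets of Fig. 2 can be entered with `t = SETree(top=1, Q={3, 4, 5})` followed by `t.add(t.root, 2)`, `n15 = t.add(t.root, 5)`, `n135 = t.add(n15, 3)`, `t.add(n15, 4)` and `t.add(n135, 4)`. This insertion order is one arbitrary choice and need not reproduce the tree of Fig. 4.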

In order to analyze the complexity of SE_trees, we give the following definitions. For each
node $x$ in an SE_tree, define path(x) to be the set of nodes in the tree path from the root to $x$.
Define weight(x) to be the number of elements $u \in U$ such that D_index(u) contains $x$. Define
$wn(D) = \sum_{x \text{ is a tree node}} weight(x)$ to be the total weight of all the nodes in the tree that implements set
$D$. Letting des(x) denote the number of tree descendants of $x$, we can define
$wp(D) = \sum_{x \text{ is a tree node}} des(x) \times weight(x)$ to be the sum of the weights of every tree path. Clearly,
$|D| \leq wn(D) \leq wp(D) \leq 2 \sum_{d \in D} |d|$. Usually, $wn(D)$ is much smaller than $wp(D)$.

Lemma 1.

1. If D_index(c) is nonempty for every $c \in U$, then the total space required by an SE_tree to
implement an SE_structure with universe $U$, primary set $D$, and secondary set $Q$ is $O(wn(D))$. (Note
that a naive representation of $D$ can require $O(wp(D))$ space.)

2. Operations create, replace, and add each take unit time and space. A sequence of $j$ of
these operations requires $\Theta(j)$ space in the worst case.

3. Operation query $d \cap Q$ takes $O(|d \cap Q|)$ time.

4. Operation index $\{d \in D \mid c \in d\}$ takes $O(|\{d \in D \mid c \in d\}|)$ time.

Proof

1. The total space required by an SE_tree is dominated by the space $O(wn(D))$ needed to
store all of the D_index sets.

2.-3. Trivial.

4. Within every subtree of an SE_tree the number of gaps is less than the number of nodes
that are not gaps. This follows from the fact that only a nondestructive replace can create a gap,
and this gap always has at least two children. Thus, no leaf can be a gap, and there are more leaves
than internal nodes with at least two children. ■

We will consider useful variants of SE_structures that require minor alteration to the preceding
implementation and do not affect the stated complexities. A Simple SE_structure is one with
no secondary set. A numeric SE_structure is one in which the set elements in the primary set $D$ are
identified by natural numbers $1, ..., |D|$ (cf. Fig. 5). Numeric SE_structures have special importance
in connection with our second abstract datatype described next.

3.2.  Abstract  Maps 

The second abstract datatype used in our pattern matching algorithm is the SE_map, which is
a partial function $f: D \to R$ from a domain set $D$ to a range set $R$, where $D$ and $R$ are the primary sets
of two SE_structures. Let $\top$ be the top element of $R$’s SE_structure, so that $\{\top\} \in R$. It is convenient
to postpone saying how $f$ is initialized until later, and focus on the following two map
operations:

1.  modify  range(A,z):  Given  a  set  A  and  an  element  z,  where  AcD,  and  z  does  not  belong 
to  any  set  in  R,  add  z  to  /(x)  for  each  x  belonging  to  A.  This  operation  can  modify 
.S'£_sct  R  as  well  as  map  /.  It  is  denoted  by, 

for  x  g  A  loop 
if  a-  4  domain /then 

Ax)  ■■=  fx} 

end 

Jlx)  with:=  z 
end 

2.  modify  domain(x,z ):  Given  a  set  x  in  the  domain  of  /,  where  (x  with  z)  belongs  to  D  but 
not  to  the  domain  of  f  map  the  new  set  x  with  z  under  /  to  the  old  image  /  (x).  This 
operation  modifies  /but  not  5f?_sets  D  or  R.  It  is  denoted  by, 

f(x  with  z)  :=  Ax) 

Our basic implementation of SE_maps f: D → R uses SE_tree implementations for D and R as
described above. In addition, whenever f(d) = r, if x and y are the node_records associated with
sets d and r, then in addition to the node_record fields previously described, x also stores a pointer
to y, and y also stores the size of the preimage set f⁻¹{r}.

To implement modify range(A, z), we assume that the sets belonging to A are represented by a
linked list of nodes in the SE_tree. In a single scan through A, we compute the subset A' of nodes
that do not belong to the domain of f. For each node x ∈ A', we store a pointer in x to the node y
associated with {x} ∈ R, and increment the preimage count in y. Next, in a second scan through A,
we form buckets A ∩ f⁻¹{y} and bucket-counts for each y ∈ f[A]. This allows us to process the
elements of A efficiently, and to modify SE_set R according to two different cases. (1) For each
range element y ∈ f[A] whose preimage is entirely contained in A (which occurs when the
bucket-count for y equals the preimage count for y), we execute a replace operation y with:= z on SE_set
R. (2) For each element y ∈ f[A] not handled in case (1), we execute an add operation R with:= y
with z, relink each element in A ∩ f⁻¹{y} to the new set y with z, and modify preimage counts.
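
The two scans and the two cases can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not our implementation: sets in R are modeled as frozensets rather than SE_tree nodes, and the names modify_range, preimage_count, and top are hypothetical. Because frozensets are immutable, the sketch relinks bucket members in both cases, whereas a replace on an SE_tree node updates the node in place; the total work is still proportional to |A|.

```python
from collections import defaultdict

def modify_range(f, preimage_count, R, A, z, top):
    """Add z to f(d) for every d in the list A (hypothetical dict-based model).

    f              : dict, domain element -> frozenset belonging to R
    preimage_count : dict, frozenset in R -> number of domain elements mapped to it
    R              : set of frozensets (the range SE_set)
    top            : frozenset standing for the top set of R
    Precondition   : z occurs in no set of R.
    """
    # First scan: elements outside domain f are mapped to the top set.
    for d in A:
        if d not in f:
            f[d] = top
            preimage_count[top] = preimage_count.get(top, 0) + 1

    # Second scan: bucket the elements of A by their current image.
    buckets = defaultdict(list)
    for d in A:
        buckets[f[d]].append(d)

    for y, members in buckets.items():
        new_y = y | {z}
        if len(members) == preimage_count[y]:
            # Case (1): the whole preimage of y lies in A, so y is replaced.
            R.discard(y)
            del preimage_count[y]
        else:
            # Case (2): y keeps part of its preimage; y with z is added to R.
            preimage_count[y] -= len(members)
        R.add(new_y)
        preimage_count[new_y] = len(members)
        for d in members:          # relink the bucket to the new set
            f[d] = new_y
    return f
```

In this model a call with A = ['a', 'c', 'd'], where 'd' is new to the domain and 'c' is the only preimage of its image, exercises both cases in one pass.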

The modify domain(x, z) operation is only executed immediately after a set x in the domain of
f is modified by either operation add(x, z) or replace(x, z). The implementation is different in each
of these two cases. If x is modified by replace(x, z), then deleting x from D implicitly removes x
from the domain of f. Hence, in this case, which we call an implicit modify domain, the
implementation is vacuous. However, if x is modified by add(x, z), then we need to explicitly modify f by
linking the new domain element x with z to the old range element f(x) and increment the preimage
reference count.

Analysis of the preceding implementation of SE_maps is straightforward and follows
immediately from Lemma 1.

Lemma 2.

1. The time to execute modify range is O(|A|).

2. Implicit modify domain operations cost nothing. A modify domain operation that is not
implicit takes O(1) time.

If D is the primary set of a numeric SE_structure, it is sometimes useful to implement
domain f as an array, accessed using the numeric code of a D element as shown in Fig. 5. This idea
is extended to multi-dimensional arrays used to implement the domain of a multi-dimensional
SE_map f: ×_{i=1}^{A(f)} D_i → R, where, for i = 1, ..., A(f), D_i is the primary set of an SE_structure. In this
case, where f has arity k > 1, we include a dimension parameter i in operation modify
domain_i([x_1, ..., x_k], z) to map f under [x_1, ..., x_i with z, ..., x_k] to the old image f(x_1, ..., x_k). We also
assume a precondition that [x_1, ..., x_k] ∈ domain f, [x_1, ..., x_i with z, ..., x_k] ∉ domain f, and x_i with z
∈ D_i.

The preceding algorithms adapt readily to these array implementations. However, since the
domains of SE_maps can be augmented, we must account for overhead costs in maintaining these
arrays dynamically. We implement dynamic multi-dimensional arrays by generalizing the method
of unit-time array initialization found in the solution to exercise 2.12 of Aho, Hopcroft, and
Ullman’s book [1]. Their method permits a one-dimensional array of size s to double its size in unit
time if growth space exists. If there is no growth space, we can initialize a new array of size 2s in
unit time and then copy the old array into the new array in s steps. A multi-dimensional array that
needs to double the size of one of its dimensions can be reduced to the one-dimensional case.
However, if the dimension that doubles can vary, then we cannot assume that growth space is ever
available.


Consider a k-dimensional array Q, where index values in dimension i for i = 1, ..., k range from
1 to r_i, and Q is filled with entries (i.e. from the SE_map domain) only for index values from 1 to
e_i. Thus, Q has size r_1 × ··· × r_k and is filled with e_1 × ··· × e_k entries. Consider a single
operation extend_i, which is implemented by the following two steps:

1. If e_i = r_i, then reallocate Q with double the range of the ith dimension; i.e., assign 2r_i to r_i.

2. Add one to e_i.


Consider an arbitrary sequence of extend operations starting from an initial array with
e_i = r_i = 1, i = 1, ..., k. The overhead in executing this sequence is the total reallocation cost in Step
1. The amortized overhead per array element is the maximum over all such sequences s of the
overhead for s divided by the number of elements in the array after s is executed.


LEMMA 3. The amortized overhead per array element in a k-dimensional array due to
executing an arbitrary sequence of extend operations starting from the unit array is Θ(k).

Proof Whenever the range of a dimension is doubled in Step 1 of an extend operation, we
need to allocate twice the space of the current array (a unit-time operation by the method of Aho,
Hopcroft, and Ullman) and to copy every entry in the old array into the new array (which can be
done in time proportional to the number of entries copied by using strength reduction to access and
copy an array element in unit time).

Let a segment be a maximal contiguous subsequence of a sequence of extend operations in
which the last extend in the subsequence doubles the range of some dimension, but no other extend
involves any such doubling. Since the last extend operation in a worst case sequence must double
the range of some dimension, we limit our analysis to sequences of segments instead of sequences
of extend operations. Let f_i be the number of entries in an array just after the ith segment is
executed; let c_i be the overhead cost due exclusively to the ith segment. Clearly, c_i ≤ f_i. Since
doubling the range of one dimension doubles the size of the array, the array size after execution of the
ith segment is 2^i. Hence, we also know that c_i ≤ 2^{i-1}. Since e_j ≥ r_j/2, j = 1, ..., k, holds after every
extend operation, we know that f_i ≥ 2^{i-k} holds after every segment is executed, i = 1, 2, .... Thus, the
overhead from executing the first i segments is,

\[ \sum_{j=1}^{i} c_j \;\le\; k f_i + \sum_{j=1}^{i-k} 2^{j-1} \;=\; k f_i + 2^{i-k} - 1 \]

and an upper bound on the overhead per array element is,

\[ \frac{k f_i + 2^{i-k} - 1}{f_i} \;=\; k + \frac{2^{i-k} - 1}{f_i} \;\le\; k + \frac{2^{i-k} - 1}{2^{i-k}} \;<\; k + 1 \]


Next, we show that this bound is realizable. Starting from an initial array Q of unit size, we
perform (i + 1)k segments as follows. First, for each dimension j = 1, ..., k perform i segments, each
doubling dimension j. Begin a new segment by performing successive extend operations until the
entire array is filled, so that it contains 2^{ik} entries. The total overhead to this point is 2^{ik} - 1.
Next, perform one extend operation in each dimension, causing additional overhead costing at least
k 2^{ik} for a cumulative total overhead of at least (k+1) 2^{ik}. Thus, we obtain a lower bound Ω(k) on
the overhead per array element. ■
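
The accounting in Lemma 3 can be checked numerically with a small simulation. The sketch below is illustrative only: it tracks just the ranges r_i and fill counts e_i, charges each reallocation the number of entries copied, and uses the hypothetical names overhead_per_element and adversarial. The second function builds the sequence from the lower bound argument (fill each dimension to 2^i, then extend once in every dimension).

```python
from math import prod

def overhead_per_element(k, ops):
    """Simulate extend operations on a k-dimensional array.

    ops is a sequence of dimension indices in 0..k-1. The result is the total
    reallocation (copy) cost divided by the number of entries at the end.
    """
    r = [1] * k          # allocated range of each dimension
    e = [1] * k          # filled range of each dimension
    overhead = 0
    for i in ops:
        if e[i] == r[i]:
            overhead += prod(e)   # Step 1: copy every entry of the old array
            r[i] *= 2
        e[i] += 1                 # Step 2
    return overhead / prod(e)

def adversarial(k, i):
    """Sequence from the proof: fill each dimension to 2**i, then extend each once."""
    ops = []
    for j in range(k):
        ops += [j] * (2 ** i - 1)
    ops += list(range(k))
    return ops
```

For k = 2 the ratio returned for adversarial(2, i) stays below k + 1 = 3 and approaches it as i grows, in line with the Θ(k) bound.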

Hoffmann and O’Donnell did not consider dynamic arrays, and their pessimistic analysis
suggests that they simply preallocated enough space to accommodate worst case instances. Although
Chase used algorithms that required dynamic multi-dimensional arrays, he did not analyze this
cost, nor did he make use of unit-time initialization. In the next section we will use Lemma 3 to
show that the overhead due to array doubling accounts for only a fraction of the total time for full
pattern preprocessing. However, we do pay a price in space. Based on the proof of Lemma 3, the
final space allocation of a dynamic k-dimensional array can be 2^k times the number of entries in the
array. Of course, any overallocation during preprocessing is not needed for matching and can be
shed.
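
For readers unfamiliar with unit-time array initialization, the following Python sketch shows the standard idea behind it: a cell counts as initialized only if it is certified by a back pointer into a stack of initialized indices. This is a sketch under assumptions, not our implementation. Python cannot allocate uninitialized memory, so the constructor fills the arrays with arbitrary junk merely to simulate it (which itself takes linear time here), and the class name LazyArray is hypothetical.

```python
import random

class LazyArray:
    """Array whose cells read as `default` until first written, with no clearing pass."""

    def __init__(self, size, default=None):
        rng = random.Random(0)
        self.default = default
        # The three arrays stand in for uninitialized storage: contents are junk.
        self.data = [rng.randrange(size) for _ in range(size)]
        self.where = [rng.randrange(size) for _ in range(size)]
        self.stack = [rng.randrange(size) for _ in range(size)]
        self.top = 0   # number of cells written so far

    def _valid(self, i):
        # Cell i is initialized iff where[i] points into the used part of the
        # stack and that stack slot points back at i.
        w = self.where[i]
        return w < self.top and self.stack[w] == i

    def get(self, i):
        return self.data[i] if self._valid(i) else self.default

    def set(self, i, value):
        if not self._valid(i):
            self.where[i] = self.top
            self.stack[self.top] = i
            self.top += 1
        self.data[i] = value
```

The validity test costs O(1) per access, which is what lets a freshly allocated array of size 2s be treated as initialized in unit time.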


[Figure: the array implementing domain f is indexed by i, the numeric code for x ∈ D; entry i points to the node_record for y = f(x) in R, which stores the preimage count |f⁻¹{y}|.]

Fig. 5. Implementation of SE_map f: D → R


3.3.  Abstract  Algorithm 

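Let F be the set of function symbols appearing in PF. For each function f ∈ F, let A(f) be its
arity. Let Γ be the set of Hoffmann and O’Donnell match sets. From the above discussion, we
know that the following equations hold:

\[
\begin{aligned}
(3)\quad \Gamma &= \{\{v, s\} : s \in PF \mid s \text{ is a leaf}\} \;\cup\; \bigcup \{\mathrm{range}\ \theta_f : f \in F \mid A(f) > 0\} \\
\Pi_f^i &= \{c_i : f(c_1, \ldots, c_k) \in PF\} \\
\mu_f^i &= \{[m,\ m \cap \Pi_f^i] : m \in \Gamma\} \\
\theta_f &= \{[[m_1, \ldots, m_k],\ m] : m_1 \in \mathrm{range}\ \mu_f^1, \ldots, m_k \in \mathrm{range}\ \mu_f^k\} \\
&\qquad \text{where } m = \{f(c_1, \ldots, c_k) \in PF \mid c_i \in m_i,\ i = 1, \ldots, k\} \cup \{v\}
\end{aligned}
\]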

Because the preceding equations contain a cyclic dependency in which Γ depends on both PF and
θ, μ depends on Γ, and θ depends on μ and PF, it would seem that a costly fixed point iteration is
needed to maintain these equations when PF is modified. Fortunately, this can be avoided with
careful scheduling.

The algorithm also depends on a careful logical organization of the data into SE_structures
and SE_maps. Recall that sets range μ_f^i represent Chase match sets for f ∈ F and i = 1, ..., A(f).
We will use numeric SE_structure (PF, Γ, P, v), simple numeric SE_structure (PF, range μ_f^i, ., v)
and SE_map μ_f^i: Γ → range μ_f^i for f ∈ F and i = 1, ..., A(f), and multi-dimensional SE_map θ_f:
×_{i=1}^{A(f)} range μ_f^i → Γ for each f ∈ F. Fig. 6 describes the data structures used to access the main
SE_structures and SE_maps shown in Fig. 7 (with array implementations indicated). Note that all
of the SE_maps μ_f^i are defined on a shared SE_set Γ and are accessed through an array shown in
Fig. 7. Note also that the PF_records for the SE_structure (PF, Γ, P, v) (see Fig. 6) spread the
standard PF field into two fields - an F field for the function symbol and a succ field for the
arguments of the function. For example, a pattern f(t_1, ..., t_k) ∈ PF would have a pointer to symbol f in
the F field and pointers to arguments accessible from the succ field.
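
For contrast with the incremental algorithm developed below, equations (3) can also be solved directly by the costly fixed point iteration mentioned above. The following Python sketch does exactly that for a small pattern set; it is a baseline for illustration, not our algorithm. It assumes patterns are nested tuples (f, arg_1, ..., arg_k) with strings as leaves, that "v" is the variable, and that each function symbol has a single arity. The names solve_equations and match are hypothetical, and match performs a bottom-up lookup through the computed μ and θ tables.

```python
from itertools import product

V = "v"  # the variable pattern

def subpatterns(p):
    """Yield p and all of its subpatterns."""
    yield p
    if isinstance(p, tuple):
        for arg in p[1:]:
            yield from subpatterns(arg)

def solve_equations(patterns):
    """Solve equations (3) by naive fixed point iteration."""
    PF = {V}
    for p in patterns:
        PF.update(subpatterns(p))
    compound = [q for q in PF if isinstance(q, tuple)]
    arity = {q[0]: len(q) - 1 for q in compound}
    # Pi[f, i]: the i-th arguments of f-patterns in PF.
    Pi = {(f, i): frozenset(q[i] for q in compound if q[0] == f)
          for f, k in arity.items() for i in range(1, k + 1)}
    Gamma = {frozenset({V, s}) for s in PF if not isinstance(s, tuple)}
    while True:
        # mu[f, i] sends each match set m to the Chase match set m ∩ Pi[f, i].
        mu = {key: {m: m & Pi[key] for m in Gamma} for key in Pi}
        theta = {}
        for f, k in arity.items():
            ranges = [set(mu[f, i].values()) for i in range(1, k + 1)]
            for ms in product(*ranges):
                hits = {q for q in compound if q[0] == f
                        and all(q[i + 1] in ms[i] for i in range(k))}
                theta[(f,) + ms] = frozenset(hits | {V})
        new_Gamma = Gamma | set(theta.values())
        if new_Gamma == Gamma:          # all equations hold simultaneously
            return PF, Gamma, mu, theta
        Gamma = new_Gamma

def match(t, PF, mu, theta):
    """Match set of a variable-free subject tree t, computed bottom-up."""
    if not isinstance(t, tuple):
        return frozenset({V, t}) if t in PF else frozenset({V})
    f, args = t[0], t[1:]
    if (f, 1) not in mu:                # f occurs in no pattern
        return frozenset({V})
    ms = tuple(mu[f, i + 1][match(a, PF, mu, theta)] for i, a in enumerate(args))
    return theta[(f,) + ms]
```

With the two patterns f(a, v) and f(v, b), the iteration stabilizes at six match sets, and the subject f(a, b) receives the match set containing v and both patterns.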


Fig. 6.  Core  data  structure 

It  is  useful  to  explain  our  incremental  algorithm  in  terms  of  three  cases.  Our  analysis  of  indi¬ 
vidual  operations  will  ignore  overhead  costs  involving  dynamic  arrays.  Overhead  will  be  con¬ 
sidered  afterwards. 

(case 1) Assume, first of all, that the set of patterns P is initially empty. It is also convenient
to assume that pattern forest PF (but not P) always contains v. Then in O(1) time and space we can
initialize variables Γ, Π, μ, and θ as follows:

    PF := {v}
    Γ := {{v}}
    Π := {}
    μ := {}
    θ_v := {v}

Next,  suppose  that  P  is  augmented  by  a  new  pattern  p.  In  order  to  re-establish  PF,  we  add  to 
PF  those  subpatterns  of  p  not  already  in  PF  in  an  innermost-to-outermost  order.  Because  of  the 
order  in  which  updates  are  scheduled,  we  know  that  immediately  before  a  subpattern  q  of  p  is 
added  to  PF ,  either  q  is  a  leaf  or  all  the  subpatterns  of  q  except  for  q  itself  already  belong  to  PF. 
More  importantly  we  know  that  q  is  not  the  subpattern  of  any  other  pattern  belonging  to  PF. 

(case 2) Suppose PF is augmented with a constant symbol c. In this case, we can maintain
the system of equations (3) by executing the following code just before the modification PF with:=
c:

    θ_c := θ_v
    Γ with:= θ_v with c
    θ_c with:= c
    for g ∈ F, i = 1, ..., A(g) loop
        if θ_v ∈ domain μ_g^i then
            μ_g^i(θ_v with c) := μ_g^i(θ_v)
        end
    end

In effect the preceding code can be implemented by performing a modify domain(θ_v, c) operation
on μ_g^i for each g ∈ F and i = 1, ..., A(g) such that v ∈ Π_g^i. (Recall that θ_c = θ_v if c ∉ PF.) In order to
implement the for-loop efficiently, we can use an index μ_thread = {[x, [j, g]]:
g ∈ F, j = 1, ..., A(g), x ∈ domain μ_g^j}, which maps Hoffmann and O’Donnell match sets m ∈ Γ
(where in this case m = θ_v) to conversion maps μ_g^j whose domain contains m. In order to update
the conversion maps efficiently, we implement μ_thread by maintaining a single doubly linked list
for every m ∈ Γ threading each occurrence of m within every set domain μ_g^j over all [j, g]
∈ μ_thread{θ_v}. For example, in Fig. 7 the thread for match set m ∈ Γ passes through entries in
column t of arrays implementing domain μ_g^j for each g ∈ F and j = 1, ..., A(g) such that m ∈
domain μ_g^j. Since this operation augments Γ, the arrays implementing the domains of the
conversion maps can double their size. Double links allow the thread to be adjusted in unit time
whenever an element in the thread is added, deleted, or moved (which occurs during array doubling).
By Lemma 2 the time to perform the preceding modify domain operation (not including the cost of
array doubling) is O(|μ_thread{θ_v}|).
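
As a rough illustration of case 2, the update can be written against a dictionary model of Γ and the conversion maps. This is a sketch under assumptions: match sets are frozensets, mu is a dict keyed by hypothetical pairs (g, i), and the function name add_constant is ours. For brevity the sketch scans every conversion map, whereas the μ_thread index described above visits only those maps whose domain contains θ_v.

```python
def add_constant(Gamma, mu, c, theta_v=frozenset({"v"})):
    """Maintain Gamma and the conversion maps when a new constant c joins PF.

    Gamma : set of frozensets (match sets)
    mu    : dict, (g, i) -> dict from match set to Chase match set
    Precondition: c does not yet occur in PF.
    """
    theta_c = theta_v | {c}          # the new match set theta_v with c
    Gamma.add(theta_c)
    for conv in mu.values():
        if theta_v in conv:
            # modify domain(theta_v, c): the new set inherits the old image,
            # since c is not yet an argument of any pattern.
            conv[theta_c] = conv[theta_v]
    return theta_c
```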

(case 3) The third and more difficult case to consider is when PF is augmented with pattern
f(t_1, ..., t_k), where k > 0. Below we describe how a two-stage cascade of updates can be used to
propagate modifications to each of the variables Γ, Π, μ, and θ in order to re-establish equations
(3). Recall that set Γ and each of the sets range μ_f^i, f ∈ F, i = 1, ..., A(f), will be implemented as
SE_trees.

Fig. 7. Data structure for θ_f and μ_f

1. In O(k) time update Π_f^j before the modification PF with:= f(t_1, ..., t_k). (Note that the
array implementing Π_f^j can double when PF is augmented.)

    for j = 1, ..., k loop
        if t_j ∉ Π_f^j then
            Π_f^j with:= t_j
        end
    end

The preceding code gives rise to Stage-One updates. Each modification Π_f^j with:= t_j to
projection Π_f^j makes the equation for Π_f^j hold for the new value of PF, but falsifies the equation for
μ_f^j. In order to re-establish the equation for μ_f^j with respect to the new value of Π_f^j (but not the new
value of PF), we perform a modify range operation on μ_f^j. However, modification to the range of
μ_f^j falsifies the equation for θ_f. We re-establish this equation for the new value of Π_f^j (but not the
new value of PF) by executing modify domain operations on θ_f. The Stage-Two updates establish
all equations for the new value of PF. Details for Stage-One are given just below.

2. Perform a modify range({m ∈ Γ | t_j ∈ m}, t_j) operation on μ_f^j immediately prior to the
modification Π_f^j with:= t_j of Step 1:

    for m ∈ Γ | t_j ∈ m loop
        μ_f^j(m) with:= t_j
    end

As discussed in SE_tree operation 5, set Γ_index(t_j), which is obtained from the
PF_record for symbol t_j (see Fig. 6), is used to retrieve the subset {m ∈ Γ | t_j ∈ m} of
node_records in the numeric SE_tree (PF, Γ, P) (see Fig. 7). The numeric codes in these
node_records are used to access the array for domain μ_f^j (see Fig. 7). By Lemmas 1 and
2, the cost of executing this step is O(|{m ∈ Γ | t_j ∈ m}|). Although add(μ_f^j(m), t_j)
operations used to implement modify range can cause the range of the jth dimension of the
array storing θ_f to double, we will charge such overhead to construction costs for θ_f.

3. Perform a modify domain_j([m_1, ..., m_k], t_j) operation for each [m_1, ..., m_k] ∈ domain θ_f,
where m_j = μ_f^j(m), prior to each add(μ_f^j(m), t_j) operation used to implement the
modify range of Step 2, but just after any doubling of multi-dimensional array θ_f that
might result from augmenting range μ_f^j. Recall that the modify domain is implicit (i.e.,
implemented at no cost) whenever the modify range of Step 2 is implemented using
replace.

    for [m_1, ..., m_j, ..., m_k] ∈ domain θ_f | m_j = μ_f^j(m) loop
        θ_f(m_1, ..., m_j with t_j, ..., m_k) := θ_f(m_1, ..., m_j, ..., m_k)
    end

Here θ_f(m_1, ..., m_j with t_j, ..., m_k) = θ_f(m_1, ..., m_j, ..., m_k), because the pattern
f(t_1, ..., t_k) has not yet been added to PF, and so no f-pattern in PF has t_j as its jth
argument. Since range θ_f is unchanged, Γ is unchanged also. Hence, the three preceding
steps establish all equations relative to the new value of Π_f^i for i = 1, ..., A(f).

This operation can be implemented naively by an exhaustive search in which every entry in a
k-dimensional array implementation of θ_f with value m_j in the jth dimension is copied to a new
position differing only from the old position by index value m_j with t_j in the jth dimension.
Alternatively, if the domain of θ_f is sparse, we can speed up the search by using k indexes
{[m_i, [m_1, ..., m_k]]: [m_1, ..., m_k] ∈ domain θ_f}, i = 1, ..., k. However, the indexes do not need to
store k-tuples explicitly. Each index can be implemented efficiently as lists threading elements of
θ_f. That is, each Chase match set m ∈ range μ_f^j has a pointer to a threaded list of entries
θ_f(m_1, ..., m_k) such that [m_1, ..., m_k] ∈ domain θ_f and m_j = m. A simple address calculation can
then be used for copying. After each copy we need to update k threads for the k indexes in O(k)
time. Thus, our sparse implementation together with Lemma 2 lets us perform this operation in
time proportional to the number of copy operations times k.
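
The sparse variant can be sketched as follows. The sketch is an illustration under assumptions: θ_f is modeled as a dictionary from k-tuples of numeric codes to values, each of the k indexes is a dictionary from a code to a Python list of tuples (standing in for the threaded lists), and the class and method names are hypothetical. Python lists do not support the unit-time deletion that doubly linked threads provide, so only insertion and copying are shown.

```python
from collections import defaultdict

class SparseTheta:
    """Sparse k-dimensional map with one threaded index per dimension."""

    def __init__(self, k):
        self.k = k
        self.table = {}                                    # k-tuple -> value
        self.index = [defaultdict(list) for _ in range(k)]

    def insert(self, key, value):
        if key not in self.table:
            # O(k) maintenance, paid once per element of the domain.
            for j, code in enumerate(key):
                self.index[j][code].append(key)
        self.table[key] = value

    def modify_domain(self, j, old_code, new_code):
        """Copy every entry with old_code in dimension j to the position that
        differs only by new_code in dimension j. Returns the number of copies."""
        copies = 0
        for key in list(self.index[j][old_code]):          # follow the thread
            new_key = key[:j] + (new_code,) + key[j + 1:]
            self.insert(new_key, self.table[key])
            copies += 1
        return copies
```

Only the entries on the thread for old_code are visited, so the work is proportional to the number of copies times k rather than to the size of the whole array.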

The Stage-Two updates result from modification to PF. First we execute a modify range
operation on θ_f in order to re-establish the equation for θ_f relative to the new value of PF. Because
updating the range of θ_f falsifies equations (3) for certain of the conversion maps μ_g^j, we need to
perform modify domain operations on these maps. Consequently, after Stage-Two all of the
equations (3) hold relative to the new value of PF. Details for Stage-Two are given below.

4. Perform a modify range({[m_1, ..., m_k] ∈ ×_{i=1}^{k} range μ_f^i | t_1 ∈ m_1, ..., t_k ∈ m_k},
f(t_1, ..., t_k)) operation on θ_f just before the modification PF with:= f(t_1, ..., t_k) and
after the preceding three steps:

    for m_1 ∈ range μ_f^1, ..., m_k ∈ range μ_f^k | t_1 ∈ m_1, ..., t_k ∈ m_k loop
        if [m_1, ..., m_k] ∉ domain θ_f then
            θ_f(m_1, ..., m_k) := θ_v
        end
        θ_f(m_1, ..., m_k) with:= f(t_1, ..., t_k)
    end

Whenever a new k-tuple is added to the domain of θ_f, we also need to update the k
threaded indexes used in the sparse implementation for Step 3. Fortunately, this O(k)
maintenance operation is performed only once for each element in domain θ_f. We can
use set range μ_f^j_index(t_j) to search through the sets {m_j ∈ range μ_f^j | t_j ∈ m_j} (which
must be nonempty because t_j was previously added to some match set in Γ, and because
Step 2 added t_j to range μ_f^j) instead of the potentially much larger sets range
μ_f^j, j = 1, ..., k. However, this step contains a new operation to create a k-tuple
[m_1, ..., m_k] and locate it in the domain of θ_f. Hashing is a practical solution with good
space utilization and good expected time. This would also make the Bottom-Up Step
O(k) expected time. Our current implementation uses this approach. Another way of
keeping space costs down at the expense of time is to use a balanced search tree; e.g., a
red/black tree [36]. Accessing the domain of the transition map θ_f then takes
O(k log(|domain θ_f|)) time, and so does the Bottom-Up Step. Like Chase we can also
use a k-dimensional array to store θ_f, which doubles its size and reorganizes whenever it
overflows. In this case the running time for this operation is proportional to the number
of times θ_f is updated by Lemmas 1 and 2. A constant factor k is avoided in each array
access by using strength reduction.

5. Add a new code for a match set to Γ prior to each add operation that results from
executing θ_f(m_1, ..., m_k) with:= f(t_1, ..., t_k) within the modify range of Step 4. The old match
set code is reused when the modify range of Step 4 is implemented using replace.

    Γ with:= θ_f(m_1, ..., m_k) with f(t_1, ..., t_k)

This operation can cause the arrays implementing the domains of conversion maps to
double. Since pattern f(t_1, ..., t_k) is newly added to PF, it is not a subpattern of any
other pattern in PF. Thus no further modification is needed for Π.

6. Just before each add(θ_f(m_1, ..., m_k), f(t_1, ..., t_k)) operation used to implement the
modify range of Step 4, perform a modify domain(θ_f(m_1, ..., m_k), f(t_1, ..., t_k)) operation
on μ_g^j for g ∈ F and j = 1, ..., A(g) such that match set θ_f(m_1, ..., m_k) belongs to the
domain of μ_g^j. An implicit modify domain is performed (at no cost) for each replace used
to implement modify range in Step 4.

    for g ∈ F, j = 1, ..., A(g) loop
        if θ_f(m_1, ..., m_k) ∈ domain μ_g^j then
            μ_g^j(θ_f(m_1, ..., m_k) with f(t_1, ..., t_k)) := μ_g^j(θ_f(m_1, ..., m_k))
        end
    end

Observe that within the preceding code μ_g^j(θ_f(m_1, ..., m_k) with f(t_1, ..., t_k)) =
μ_g^j(θ_f(m_1, ..., m_k)), because f(t_1, ..., t_k) ∉ Π_g^j. Since the range of μ_g^j is unchanged, the
equation for θ_g remains satisfied, and no further updates are necessary. The for-loop is
implemented efficiently using the μ_thread index described in case 2. By Lemma 2 the
time to perform this operation is O(|μ_thread{θ_f(m_1, ..., m_k)}|).

The preceding discussion combines the correctness proof with the design description. However,
we still need to analyze the performance of full batch processing, and compare our results
with Chase’s. In both Chase’s and our algorithms the time complexity is dominated by the time
needed to construct the maps μ_f^j and θ_f, where f ∈ F and j = 1, ..., A(f). However, since Chase [7]
did not provide complete data structuring for an implementation and analysis, the comparison is
based in part on our own data structures (not included in this paper) and analysis for his algorithm.
In the following theorem we let l_g represent the total number of distinct g-patterns in PF for
g ∈ F.

Theorem 4.

1. For each m ∈ Γ, f ∈ F, and j = 1, ..., A(f) Chase’s algorithm computes μ_f^j(m) in
Ω(min(|m|, |Π_f^j|)) time, which is improved by our algorithm to O(|μ_f^j(m)|) time when
m ∈ domain μ_f^j and O(1) time otherwise. By coarser analysis the total preprocessing time
contributed by μ is O(|Γ| k_max l) for both Chase and us.

2. Let function symbol f have arity k > 0. For each [m_1, ..., m_k] ∈ ×_{i=1}^{k} range μ_f^i Chase’s
algorithm computes θ_f(m_1, ..., m_k) in Ω(min(l_f, |m_1 × ··· × m_k|) k) time if [m_1, ..., m_k] belongs to
domain θ_f and O(k) time otherwise. Our algorithm improves this bound to
O(k + |θ_f(m_1, ..., m_k)|) time if [m_1, ..., m_k] belongs to domain θ_f and O(k) time otherwise. By
coarser analysis the total preprocessing time contributed by θ is
O(min(l k_max 2^{l k_max}, l k_max |Γ|^{k_max})) for Chase and
O(min((l + k_max) 2^{l k_max}, (k_max |F| + l) |Γ|^{k_max})) for us.

3. We use O(wn(Γ)) auxiliary space to represent the set Γ, whereas Chase uses Ω(wp(Γ))
space. When we include the threaded lists used in the sparse implementation for θ, our total
auxiliary space to store θ during preprocessing is roughly O((k_max + 2^{k_max}) 2^{l k_max} + l 2^l). Chase’s
space is comparable. The factor of 2^{k_max} is due to overallocating dynamic arrays, and can be shed
during matching.

4. To represent μ_f^j we use O(|Γ| + wn(range μ_f^j)) auxiliary space, whereas Chase uses
Ω(|Γ| + wp(range μ_f^j)) space. By a coarse analysis for total preprocessing space contributed by μ
we get a bound of O(l k_max |Γ|) for both Chase and us.

Proof

1. For each m ∈ Γ Chase’s algorithm computes μ_f^j(m) = m ∩ Π_f^j by actually intersecting m
and Π_f^j, which takes Ω(min(|m|, |Π_f^j|)) time. We avoid computing the intersection, and spend
only O(|μ_f^j(m)|) time to establish the value of μ_f^j(m) for each m ∈ domain μ_f^j. The time needed
to construct these conversion maps is charged to modify domain operations, modify range
operations, and overhead for dynamic arrays that store the domains of these maps. By the preceding
discussion of Case 2 and of Case 3, Step 6 in our algorithm, we know that the cumulative expense of
executing modify domain operations on each conversion map μ_f^j is O(|domain μ_f^j|), which includes the
cost of maintaining index μ_thread. By the preceding discussion of Case 3, Step 2 in our algorithm, a
coarse upper bound on the total cost of executing modify range operations is
O(Σ_{m ∈ domain μ_f^j} |μ_f^j(m)|)
for each conversion map μ_f^j. Since the domains of these conversion maps all use dynamic
1-dimensional arrays with index values ranging from 1 to |Γ|, the overhead per array is O(|Γ|) by
Lemma 3. Combining these costs yields the first part. To obtain a coarse upper bound on the time
to construct all of the conversion maps, we use the following inequalities: |μ_g^j(m)| ≤ l_g,
|Π_g^j| ≤ l_g, A(g) ≤ k_max, and |domain μ_g^j| ≤ |Γ| for g ∈ F and j = 1, ..., A(g). Consequently,
we obtain a rough upper bound O(k_max l_g |Γ|) on the cumulative charges to construct all
conversion maps for each function g ∈ F. The result follows.

2. For each [m_1, ..., m_k] ∈ ×_{i=1}^{k} range μ_f^i Chase’s algorithm computes θ_f(m_1, ..., m_k) by
evaluating the set {f(c_1, ..., c_k) ∈ PF | [c_1, ..., c_k] ∈ m_1 × ··· × m_k} naively, which takes
Ω(min(l_f, |m_1 × ··· × m_k|) k) time. Roughly speaking, our algorithm assumes that the initial
value of θ_f(m_1, ..., m_k) is {v} by default. Then it gets new values in the modify domain operation
of Case 3, Step 3 by copying. Each copy takes O(k) time in order to maintain k threaded indexes.
The value of θ_f(m_1, ..., m_k) increases one element at a time in the modify range operation of Case
3, Step 4, where an O(1) time per element is a coarse upper bound. Thus, we spend
O(|θ_f(m_1, ..., m_k)|) time from Step 4 and another O(k) time from Step 3 (for maintaining sparse
threaded indexes) to establish the value of θ_f(m_1, ..., m_k). By Lemma 3 the overhead to maintain
the dynamic k-dimensional array storing θ_f is bounded by O(k |×_{j=1}^{k} range μ_f^j|), which also means
that k is charged to every unit of space in the array implementing θ_f. This proves the first part.
Our improvement over Chase is revealed by the following calculation:

\[ |\theta_f(m_1, \ldots, m_k)| \;=\; \Bigl|\{[q_1, \ldots, q_k] : f(q_1, \ldots, q_k) \in PF\} \cap \mathop{\times}_{i=1}^{k} m_i\Bigr| \;\le\; \min\Bigl(l_f,\ \Bigl|\mathop{\times}_{i=1}^{k} m_i\Bigr|\Bigr) \]

To prove a coarse upper bound on the total time needed to construct all of the transition
maps, we first prove a time bound for a single map θ_f, where f has arity k. Since |range μ_f^i| ≤ 2^{l_f},
we can bound the overhead costs at O(k 2^{k l_f}). Since |domain θ_f| ≤ 2^{k l_f}, the total cost in
constructing θ_f is O(|domain θ_f| (l_f + k + 1) + k 2^{k l_f}) = O(2^{k l_f} (l_f + k)). Alternatively, since we also know
that |range μ_f^i| ≤ |Γ|, then another bound on overhead costs is O(k |Γ|^k). Since |domain
θ_f| ≤ |Γ|^k, another bound on the total cost in constructing θ_f is O(|Γ|^k (l_f + k + 1) + k |Γ|^k) =
O(|Γ|^k (l_f + k)). Summing over all function symbols g with arity greater than 0, we obtain the
bound O(min(|Γ|^{k_max} (l + k_max |F|), (l + k_max) 2^{l k_max})) on the total cost of constructing all
transition maps. Analysis for Chase’s algorithm follows similar logic.

3  and  4.  Follows  from  previous  analysis.  ■ 

The fine analysis in parts 1 and 2 of the preceding theorem reveals our asymptotic advantage
over Chase's algorithm. The following simple calculation illustrates our potential space advantage
hinted at in parts 3 and 4. When the SE_tree implementing T is a full binary tree with weight w at
each node, then $wn = |\Gamma|$ and $wp = |\Gamma| \log |\Gamma|$.

4.  Elimination  of  gaps 

A gap in the SE_tree represents a set of patterns which is not a match set. In the extreme case,
all the internal nodes except the root could be gaps. Thus it is useful to consider how to eliminate
gaps in order to save space.

Consider the SE_tree implementing SE_structure (PF, F ., v). For convenience, we say a
pattern q labels a tree node x if $x \in \Gamma\text{-index}(q)$. Thus, if Z is the set of patterns represented by a
node z in the SE_tree, then $Z = \{q \in PF \mid q \text{ labels an ancestor of } z\}$.

We say a gap in the SE_tree is maximal if its parent is not a gap. The set of maximal gaps
can be computed efficiently if we add a parent pointer to each node in the SE_tree. We say an
SE_tree is compact if it has no gaps. If M is a finite set of patterns, we use glb(M) to represent the
most general pattern that is more specific than any pattern in M. Lemma 13 in the Appendix gives a
necessary and sufficient condition for the existence of glb(M).
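
As an illustration of how parent pointers make the maximal gaps easy to find, the following Python sketch scans the nodes once and keeps each gap whose parent is not a gap. The `Node` class and its `is_gap` flag are assumptions made for this example; the paper does not prescribe a node layout.

```python
class Node:
    """Hypothetical SE_tree node carrying only what the example needs."""
    def __init__(self, name, parent=None, is_gap=False):
        self.name = name
        self.parent = parent
        self.is_gap = is_gap

def maximal_gaps(nodes):
    """Gaps whose parent is not a gap (a root gap would count as maximal)."""
    return [n for n in nodes
            if n.is_gap and (n.parent is None or not n.parent.is_gap)]

root = Node("root")
a = Node("a", root, is_gap=True)
b = Node("b", a, is_gap=True)      # gap below a gap: not maximal
c = Node("c", root)
d = Node("d", c, is_gap=True)
found = [n.name for n in maximal_gaps([root, a, b, c, d])]
```

Each node is inspected once and the parent test is constant time, so the scan is linear in the number of nodes.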

Let T be an SE_tree implementing SE_structure (PF, T. ., v), and let T' be the new SE_tree
that results from T due to the insertion of a new pattern p into PF using the on-line preprocessing
algorithm given in Section 3.3. Assuming T is compact, we consider how to make T' compact also.
We prove the following lemma:

LEMMA 5. If x is a gap in T', then every descendant of x is either a gap or a leaf labeled by p.

Proof Let X be the set of patterns represented by x. According to Lemma 15 of the Appendix,
X is the match set of glb(X) before p is added to PF. After p is added, x becomes a gap, and X
is no longer the match set of glb(X). Thus $X \cup \{p\}$ must be the match set of glb(X), and glb(X) <
p. This implies that any match set containing X must also contain p. Now consider a descendant y
of x in T' that is not labeled by p. Let Y be the set of patterns represented by y. Then $X \subseteq Y$. Since p
is a new pattern, it only labels leaves. Thus Y does not contain p. Therefore Y is not a match set
with respect to $PF \cup \{p\}$, and y is a gap in T'. ■

If we label the maximal gaps by p, then p is automatically added to all the sets represented by
the gaps. As a result, each node whose parent is a gap should be deleted from Γ-index(p). If this
node is a new leaf, then it is not in any Γ-index after it is deleted from Γ-index(p) and must be
deleted from T' also. Once this is done, every node in T' represents some match set with respect to
$PF \cup \{p\}$, and there are no gaps. Obviously, the deletion of leaves can be totally avoided if we do
not add them to T' and Γ-index(p) in the first place.

Recall from Section 3.2 the two cases for implementing the operation modify_range(A, z). (1)
For each range element $y \in f[A]$ whose preimage is entirely contained in A, we execute a replace
operation y with:= z on T. (2) For each element $y \in f[A]$ not handled in case (1), we execute an
add operation T with:= y with z.

We call this implementation from Section 3.2 the basic implementation. To avoid introducing
any gap into the SE_tree, we should handle Case (1) differently: for each range element
$y \in f[A]$ whose preimage is entirely contained in A, we mark y as a gap; for each maximal gap g,
we execute a destructive replace operation g with:= z on T. Case (2) is handled as before. This
new implementation of modify_range is called the compact implementation.

5.  Adaptation  to  Simple  Patterns 

Hoffmann and O'Donnell [20] presented an algorithm tailored to the Simple subclass of patterns
for which the preprocessing time and space costs for bottom-up multi-pattern matching are
greatly reduced.

Definition: A pattern forest PF is Simple if for every two distinct patterns $p, q \in PF$, either
(1) p < q, (2) q < p, or (3) there is no subject t such that t < q and t < p. A set P of patterns is
Simple if its pattern forest is Simple.
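
The definition can be turned into a direct pairwise test. The Python sketch below is only an illustration under stated assumptions: patterns are nested tuples, every variable is the string "v", and every pattern is assumed to have at least one ground instance, so two patterns share a subject exactly when they agree wherever both are not "v". The function names are hypothetical, and this quadratic test is not Hoffmann and O'Donnell's algorithm.

```python
from itertools import combinations

def subsumes(p, q):
    """True if pattern p is at least as general as pattern q."""
    if p == "v":
        return True
    if q == "v":
        return False
    return (p[0] == q[0] and len(p) == len(q)
            and all(subsumes(a, b) for a, b in zip(p[1:], q[1:])))

def share_subject(p, q):
    """True if some subject is matched by both p and q (assumption above)."""
    if p == "v" or q == "v":
        return True
    return (p[0] == q[0] and len(p) == len(q)
            and all(share_subject(a, b) for a, b in zip(p[1:], q[1:])))

def is_simple(forest):
    """Every pair is comparable or has no common subject."""
    return all(subsumes(p, q) or subsumes(q, p) or not share_subject(p, q)
               for p, q in combinations(forest, 2))

simple_pf = ["v", ("a",), ("f", "v", "v"), ("f", ("a",), "v")]
# f(a,v) and f(v,b) are incomparable but both match f(a,b)
not_simple_pf = ["v", ("a",), ("b",), ("f", ("a",), "v"), ("f", "v", ("b",))]
```

The second forest fails the test because f(a, v) and f(v, b) are incomparable yet both match the subject f(a, b).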

For Simple Patterns P, Hoffmann and O'Donnell observed that the transitive reduction of the
partial ordering (PF, <) forms a directed tree (which they called the subsumption tree) with v at the
root (assuming that v occurs in PF). Each match set equals the set of patterns along some path in
the subsumption tree from a node to the root. And every path from a node to the root determines a
match set. Thus, there are only $l$ match sets, and each one can be represented by its minimum
pattern. For a function $f$ of arity $k$, the transition table $\theta_f$ uses $O(l^k)$ space, a great improvement over
the general case but still expensive. Hoffmann and O'Donnell have also argued that most sets of
patterns they have encountered in rewriting systems are Simple or can be turned into equivalent
Simple sets.
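
To make the correspondence between match sets and root paths concrete, here is a minimal Python sketch. It assumes the subsumption tree is given as a hypothetical `parent` dictionary over pattern names; the patterns listed are invented for the example.

```python
def match_set(parent, p):
    """Patterns on the path from p up to the root of the subsumption tree."""
    path = []
    while p is not None:
        path.append(p)
        p = parent[p]
    return path

# Hypothetical subsumption tree: v is the root.
parent = {"v": None, "a": "v", "f(v,v)": "v", "f(a,v)": "f(v,v)"}
all_sets = {q: match_set(parent, q) for q in parent}
```

Each pattern determines exactly one path, so the number of distinct match sets equals the number of patterns, and storing the minimum pattern of a match set is enough to recover the whole set.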
Hoffmann and O'Donnell's special purpose algorithm for Simple Patterns runs in preprocessing
time $O(k_{max}\,l^2 + |F|\,h\,l^{k_{max}})$ and space $O(l^2 + |F|\,l^{k_{max}})$, where h is the height of the
subsumption tree. They also presented a test deciding whether a given set of patterns is Simple with
time $O(k_{max}\,l^2)$ and space $O(l^2)$.

Our algorithm, presented in the preceding section, adapts favorably to problem instances in
the class of Simple Patterns. For Simple Patterns our incremental algorithm has better asymptotic
performance than Hoffmann and O'Donnell's nonincremental special purpose algorithm.

COROLLARY 6. For Simple Patterns the preprocessing costs of our algorithm are
$O(k_{max}\,l^2 + (h + k_{max})\,l^{k_{max}})$ time and $O(l\,k_{max}(|F| + h) + (k_{max} + 2^{k_{max}})\,l^{k_{max}})$ space. The space
bound can be improved to $O(l\,k_{max}(|F| + h) + k_{max}\,l^{k_{max}})$ during matching.

Proof Since there are only $l$ match sets for Simple Patterns, Theorem 4(1) says that the time
contributed by all conversion maps $\mu$ is $O(k_{max}\,l^2)$.

Next we determine the time contribution of the transition maps $\theta$. When PF is Simple, each
match set, and so each Chase match set, is linearly ordered in the subsumption tree. Thus, each Chase
match set can be represented by its minimal element, and there can be no more than $|\Pi_f^j| \le l_f$ such
minimal elements for each $f \in F$ and each $j = 1, \ldots, A(f)$. Since PF is Simple, for any match set
m, $|m| \le h$. Then by Theorem 4(2), the total time bound contributed by all transition maps $\theta_f$ over
all function symbols $f \in F$ is

$$O\Big(\sum_{f \in F} (h + k_{max})\,l_f^{k_{max}}\Big) = O\Big((h + k_{max})\Big(\sum_{f \in F} l_f\Big)^{k_{max}}\Big) = O((h + k_{max})\,l^{k_{max}}).$$

By Theorem 4(3), the auxiliary space needed to store T is $O(wn(T)) = O(l\,h)$. Since, by the
preceding analysis, the size of each dimension of the array storing $\theta_f$ is bounded by $l_f$, the
space used to store all of the transition maps $\theta$ together with the threaded lists is roughly
$O((k_{max} + 2^{k_{max}})\,l^{k_{max}})$. Space $2^{k_{max}}\,l^{k_{max}}$ accounts for overallocating dynamic arrays, and can be
removed for matching. Since the space needed to store each conversion map $\mu_f^j$ is
$O(l + wn(\mathrm{range}\ \mu_f^j)) = O(l + l_f\,h)$, the total space utilization for all conversion maps is roughly
$O(l\,k_{max}(|F| + h))$. ■

A slight modification to our algorithm further reduces the space needed to store T and each
conversion map to $O(l)$ without sacrificing our time/space bounds for the general problem.

Let T be a compact SE_tree implementing the SE_structure (PF,T...v), and let T' be the new
SE_tree that results from T due to the insertion of p into PF using the on-line preprocessing
algorithm described in Section 3.3. Assume that PF is Simple. Then there are $l$ nodes in T and $l$
Γ-index sets.

We say a node x in T (therefore also in T') is affected if it represents a set X of patterns such
that $X \cup \{p\}$ is a new match set w.r.t. $PF \cup \{p\}$. Note that if x is affected, then either x has a child
labeled by p in T', or x itself is labeled by p in T'. An affected node x is maximum if all affected
nodes of T are descendants of x.
We say a compact SE_tree is reduced if each of the tree nodes belongs to exactly one
Γ-index. Thus, if T is reduced, then each Γ-index contains exactly one node in T, and the total space
needed for the tree nodes and Γ-index sets is $O(l)$. We assume that T is reduced, and consider how
to make T' reduced in case $PF \cup \{p\}$ is also Simple.

LEMMA 7. If T is reduced, then the following properties hold.

1. If $n_1 \in \Gamma\text{-index}(p_1)$ and $n_2 \in \Gamma\text{-index}(p_2)$ are two nodes in T such that node $n_1$ is the
parent of node $n_2$, then $p_2 < p_1$.

2. T forms the subsumption tree of PF before p is added.

3. There exists a maximum affected node in T.

4. The maximum affected node is not a gap and not labeled by p in T'.

5. $PF \cup \{p\}$ is Simple iff all the affected nodes in T except the maximum one are either gaps
or leaves labeled by p in T'.

Proof

1. Since $n_1$ is the parent of $n_2$, there is a match set containing both $p_1$ and $p_2$. Therefore
either $p_1 < p_2$ or $p_2 < p_1$. Since $n_1$ represents a match set containing $p_1$ but not $p_2$, then
$p_2 < p_1$.

2. This follows immediately from Property 1.

3. Let $n_1$ and $n_2$ be two different affected nodes representing the two match sets $N_1$ and $N_2$
respectively before the insertion of p. Then the nearest common ancestor x of $n_1$ and $n_2$ represents
the match set $X = N_1 \cap N_2$. Since $n_1$ and $n_2$ are affected, then after the insertion, there is a match
set $M_1 = N_1 \cup \{p\}$ and another match set $M_2 = N_2 \cup \{p\}$. Then $M_1 \cap M_2 = X \cup \{p\}$ is also a
match set (see Lemma 16, Appendix). Thus x is affected. This means that the nearest common
ancestor of any two affected nodes is also affected, and there must be a unique maximum affected
node.

4. Let x be a node in T. Then x has a label $q \ne p$. We need to show that if x is a gap or is
labeled by p in T', then x cannot be the maximum affected node. Let X be the set of patterns in PF
represented by x before adding p. Before adding p to PF, X is a match set of q. After adding p, X is
no longer a match set. This means that $X \cup \{p\}$ is a match set of q. Therefore q < p, and there
must be some match set M that contains p but not q. Let m be the node in T' representing M. Then
either m or its parent is affected. Since neither m nor its parent can be a descendant of x, then x is
not maximum.

5. ($\Rightarrow$) Suppose $PF \cup \{p\}$ is Simple. Let $x \in \Gamma\text{-index}(q)$ be an affected node that is not a gap
and not labeled by p in T'. Then x has a child m labeled by p. According to the proof of property 1,
we have q > p. This means every match set containing p also contains q. Thus x is the maximum
affected node.

($\Leftarrow$) Suppose all the affected nodes except the maximum one m are either gaps or labeled by p.
Let $x \in \Gamma\text{-index}(q)$ be a node in T' such that $q \ne p$. Then x is not a new leaf. We need to show that
either (1) q > p, (2) q < p, or (3) p and q cannot be in the same match set. Consider the following
cases. If x is a proper descendant of m, then x is either a gap or a leaf in Γ-index(p). The proof of
property 4 shows that q < p in this case. If x is an ancestor of m, then any match set containing p
also contains q, and there is at least one match set (for example, the match set represented by x)
that contains q but not p. Thus q > p. Otherwise, x is neither an ancestor nor a descendant of m. In
this case, neither descendants nor ancestors of x are labeled with p. Therefore p and q cannot be
contained in the same match set. ■

The proof of property 5 also tells us the position of p in the subsumption tree of $PF \cup \{p\}$ if
it is Simple: p must be a child of the pattern labeling the maximum affected node, and an ancestor of
patterns labeling other affected nodes. The preceding discussion justifies the following new
implementation of modify_range(A, z), which we call the reduced implementation:

If PF is Simple and there is only one element $m \in f[A]$ whose preimage is not
entirely contained in A, we execute an add operation T with:= m with z and make
all the affected children of m the children of the newly created node. Otherwise,
$PF \cup \{p\}$ is not Simple, and we execute the compact implementation.

THEOREM 2. Whenever PF is Simple, and the reduced implementation of modify_range is
used, then the on-line preprocessing algorithm given in Section 3.3 maintains the invariant that the
SE_tree is reduced, and is consequently the subsumption tree.

Proof  Follows  immediately  from  Lemma  7.  ■ 

6.  Pattern  Deletion 

Deleting  patterns  from  P  can  be  handled  much  like  pattern  addition,  except  that  scheduling 
pattern  deletion  from  PF  is  in  an  outermost-to-innermost  subexpression  order.  Further,  a  pattern  is 
deleted  from  PF  only  if  it  is  not  the  argument  of  any  pattern  in  PF.  The  deletion  algorithm  follows 
the  same  logic  as  the  addition  algorithm  but  in  a  backwards  order  to  undo  the  effect  of  addition. 

To delete a pattern p from PF, we also need to modify the SE_tree for SE_structure (PF, T. P,
v), the range of the transition map $\theta_f$, and the domains and ranges of all the conversion maps $\mu_f^j$. If
p has the form $f(t_1, \ldots, t_k)$, we have to consider whether each $t_i$, $i = 1, \ldots, k$, should also be
deleted. If p is the only pattern in PF with function symbol f whose i-th child is $t_i$, then we have to
delete $t_i$ from $\Pi_f^i$, and then modify the SE_tree for the range of $\mu_f^i$ and the domain of $\theta_f$. If $t_i$ is not
in P and is not a child of any pattern in PF, then we should also delete $t_i$ from PF recursively.

First we show how to modify SE_trees. Since all the SE_trees can be handled the same way,
we will consider the SE_tree for SE_structure (PF, T. P, v) only. Let x be a node in the SE_tree
representing a match set X that contains p. After p is deleted from PF, x represents the match set
$X' = X - \{p\}$. The question is whether there is another node y in the SE_tree representing the same set
X', and if so, how we should merge x and y.
To answer this question, we need two additional fields for each node x in the SE_tree: (1) a
parent pointer parent(x) pointing to the parent of x, and (2) a label list field label-list(x) storing a
list of patterns in PF that label x. The label lists are initially empty. Each time a node x is added to
Γ-index(p), pattern p is added to the right end of label-list(x), and each time a node x is deleted
from Γ-index(p), p is deleted from label-list(x). The leftmost element of a list is called the head
of the list.

For convenience, we also use the following notations. We assign an integer age(q) to each
pattern q in PF so that if q is added to PF by the i-th insertion and has not been deleted, then age(q)
= i. Thus, for any tree node x, the patterns in label-list(x) are in decreasing order of their ages from
left to right. We then define the age of a tree node x to be the age of head(label-list(x)). Thus it
makes sense to say that one node or pattern is younger or older than another. We say a node x is
normal if it is older than all its proper descendants and has a different age than any of its siblings.

It is not difficult to see that if all nodes in the SE_tree are normal, then different tree nodes
represent different sets of patterns. Thus, our main concern is how to keep every node in the
SE_tree normal after each deletion. The solution depends on the way that patterns are inserted.
We assume that the SE_tree is maintained by the basic implementation of modify_range. In this
case, the youngest tree nodes are always the new leaves, and each internal node can get at most one
new child (which is a new leaf) for each new pattern added. Therefore the SE_tree resulting from
pure insertions has the following properties:

(1)  all  the  tree  nodes  are  normal; 

(2) patterns labeling a parent are older than patterns labeling its children.

These  two  properties  lead  to  the  deletion  algorithm  described  below. 

Let p be the pattern just deleted from PF. Then we also delete p from the label list of each
node $x \in \Gamma\text{-index}(p)$. If p is the head of label-list(x), then x becomes younger and may no longer
be normal. For each such possible non-normal node x with parent y, we store a pair [x, y] into the
set affected and temporarily detach x from y, leaving an SE_tree with only normal nodes. Then we
add the detached nodes back to the SE_tree one by one, making sure that no non-normal node
results from this addition:

procedure add_back();
    for [x, y] ∈ affected loop
1       if label-list(x) = [] then
            for c ∈ children(x) loop
                make_child(c, y);
            end loop;
        else make_child(x, y);
        end if;
    end loop;
end add_back;

On line 1, we find that label-list(x) is empty, which implies that x and its parent y represent the
same set of patterns. Consequently, we do not add x back to the SE_tree, but let y adopt all the
children of x. In this case, we say that x is merged into y. The procedure make_child(x, y) adds x
into children(y), and checks whether y has another child c having the same age as x. If there is
such a node c, x and c are combined. Care is taken to ensure that Properties (1) and (2) are
maintained for each tree node. Details are given below.

procedure make_child(x, y);
2   if ∃ c ∈ children(y) | age(c) = age(x) then
        prefix := the longest common prefix of label-list(x) and label-list(c);
3       if label-list(x) = label-list(c) then
            for z ∈ children(x) loop
                make_child(z, c);
            end loop;
        elseif prefix = label-list(x) then
            label-list(c) -:= prefix;
            children(y) less:= c;
            make_child(c, x);
        elseif prefix = label-list(c) then
            label-list(x) -:= prefix;
            make_child(x, c);
        else t := newnode();
            label-list(t) := prefix;
            make_child(t, y);
            label-list(x) -:= prefix;
            make_child(x, t);
            label-list(c) -:= prefix;
            children(y) less:= c;
            make_child(c, t);
        end if;
    else children(y) with:= x;
        parent(x) := y;
    end if;
end;

It  should  be  clear  that  make_child(x,  y)  does  not  change  the  set  of  patterns  represented  by 
either  x  or  y,  except  in  line  3,  where  we  find  that  x  and  c  represent  the  same  set  of  patterns  and 
therefore  merge  x  into  c.  Efficiency  can  be  improved  here  if  we  merge  the  node  having  fewer  chil¬ 
dren  into  the  other. 
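
For readers who prefer executable code, the following Python sketch is one possible reading of add_back and make_child, not a transcription. It makes several assumptions: label lists hold integers standing for pattern ages, so the age of a node is the head of its list; the lookup for a sibling of the same age is a linear scan rather than a hash lookup; in the case prefix = label-list(x) the node x is attached to y explicitly; and in the last case the sibling c is detached before the new node t is attached. The class `Node` and the helpers `attach` and `common_prefix` are hypothetical names.

```python
class Node:
    def __init__(self, labels):
        self.labels = list(labels)   # label-list; labels[0] is the head
        self.children = []
        self.parent = None

def age(n):
    return n.labels[0]

def common_prefix(a, b):
    i = 0
    while i < len(a) and i < len(b) and a[i] == b[i]:
        i += 1
    return a[:i]

def attach(x, y):
    y.children.append(x)
    x.parent = y

def make_child(x, y):
    # Sibling with the same age as x, if any (linear scan in this sketch).
    c = next((n for n in y.children if age(n) == age(x)), None)
    if c is None:
        attach(x, y)
        return
    prefix = common_prefix(x.labels, c.labels)
    if x.labels == c.labels:            # same set: merge x into c
        for z in list(x.children):
            make_child(z, c)
    elif prefix == x.labels:            # x becomes the parent of c
        c.labels = c.labels[len(prefix):]
        y.children.remove(c)
        attach(x, y)
        make_child(c, x)
    elif prefix == c.labels:            # c becomes the parent of x
        x.labels = x.labels[len(prefix):]
        make_child(x, c)
    else:                               # factor the shared prefix into t
        t = Node(prefix)
        y.children.remove(c)
        attach(t, y)
        x.labels = x.labels[len(prefix):]
        c.labels = c.labels[len(prefix):]
        make_child(x, t)
        make_child(c, t)

def add_back(affected):
    for x, y in affected:
        if not x.labels:                # x now equals its parent: merge into y
            for c in list(x.children):
                make_child(c, y)
        else:
            make_child(x, y)

# Usage: c = [1, 2, 3] is a child of y; re-adding x = [1, 2, 4] factors out [1, 2].
y = Node([0])
c = Node([1, 2, 3])
attach(c, y)
x = Node([1, 2, 4])
add_back([(x, y)])
```

After the call, y has a single child labeled [1, 2] whose two children carry the remaining labels [3] and [4].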
Modifying the conversion maps and transition maps with respect to pattern deletion is much
easier than it is with respect to pattern addition. As in pattern addition, the task of modifying a
map consists of modifying the domain and range. To modify the domain of a map M, we simply
delete those merged nodes or tuples containing merged nodes from domain M. The space released
by this deletion can be put in a free list and reused later (when new nodes are added to the SE_tree
as a result of pattern insertion). To modify the range of a map M, we simply replace each merged
node x in range M by the node into which x is merged.

Analysis of procedure make_child is straightforward. The test on line 2 can be done in $O(1)$
expected time if children(y) are hashed by the head of their label lists. (Maintaining the hash
tables increases insertion costs by $O(1)$ space per tree node and $O(1)$ time per add operation.) If
this test succeeds, it takes another $O(|prefix|)$ time to find the longest common prefix prefix. For this
cost, we reduce the total size of label lists and, therefore, the total size of Γ-indices by $|prefix|$. The
other costs are $O(1)$ per invocation of make_child, where the total number of invocations is
bounded by the number of descendants of the nodes in Γ-index(p). Thus, we pay $O(1)$ time for
each match set from which p is deleted plus $O(1)$ time for each deletion of nodes from Γ-indices.

We have assumed that the basic implementation of modify_range is used for pattern insertion.
If we want to use the compact implementation, then it may happen that an ancestor has a label
younger than some of its descendants' labels. We can modify the procedure make_child to
accommodate this situation, but we do not know how to bound the time complexity. Since in general it
is not easy to check whether PF is Simple after each deletion, the reduced implementation can only
be used in a very limited way: once PF is no longer Simple, it will not be considered Simple again
until PF contains only one pattern v.

Finally, we want to make some comments on the effect of pattern deletions on the amortized
overhead of maintaining a dynamic array. Successive deletions of elements from the domain of an
array can make the array sparse. To improve the space utilization, we can halve the range of a
dimension whenever the load factor of that dimension is below one fourth. Using an argument
similar to the proof of Lemma 3, we can show that the amortized overhead due to an arbitrary
sequence of doublings and halvings of a k-dimensional array is still $O(k)$ for each entry added to
the array starting from the unit array.
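
The doubling and halving policy can be illustrated in one dimension. The Python sketch below is a simplification, not the k-dimensional structure of Lemma 3: it only tracks the capacity, the number of stored entries, and a counter `moves` standing in for the copying cost of each resize. The class name `DynArray` is hypothetical.

```python
class DynArray:
    """One-dimensional stand-in: doubles when full, halves below load 1/4."""
    def __init__(self):
        self.cap = 1      # start from the unit array
        self.n = 0        # number of stored entries
        self.moves = 0    # entries copied during resizes

    def add(self):
        if self.n == self.cap:
            self.moves += self.n
            self.cap *= 2
        self.n += 1

    def delete(self):
        self.n -= 1
        if self.cap > 1 and self.n < self.cap / 4:
            self.cap //= 2
            self.moves += self.n

a = DynArray()
for _ in range(8):
    a.add()
cap_after_adds = a.cap
for _ in range(7):
    a.delete()
```

In this run of 15 operations only 8 entries are ever copied, because a halving can only occur after many deletions have emptied three quarters of the array.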

7.  Space/Time  tradeoff 

In Chase's algorithm, for each function symbol $f \in F$ of arity $k$, the space required for map $\theta_f$
could be $\Omega(2^{l_f k})$. Here we give a method that decomposes $\theta_f$ into $q$ maps with worst case space
$O(q\,2^{l_f k/q})$ but leads to time $O(q)$ to solve the Bottom-Up Step.
For each $f \in F$, let $PF_f$ be the set of subpatterns in PF of the form $f(x_1, \ldots, x_k)$. Let $PF_f$ be
partitioned into $q$ disjoint equal size sets $PF_{f,1}, \ldots, PF_{f,q}$, and consider equations,

$$\Pi^i_{f,j} = \{c_i : f(c_1, \ldots, c_k) \in PF_{f,j}\}$$

$$\mu^i_{f,j} = \{[m, m \cap \Pi^i_{f,j}] : m \in \Gamma\}$$

$$\theta_{f,j} = \{[[m_1, \ldots, m_k], m] : m_1 \in \mathrm{range}\ \mu^1_{f,j}, \ldots, m_k \in \mathrm{range}\ \mu^k_{f,j}\}$$

where $m = \{f(c_1, \ldots, c_k) \in PF_{f,j} \mid c_i \in m_i,\ i = 1, \ldots, k\} \cup \{v\}$.

We modify the Bottom-Up Step as follows. Let $t = f(t_1, \ldots, t_k)$ be a subject tree. Instead of
computing one Chase match set $ms(t_i)$ for each child $t_i$ of $t$ and one Hoffmann and O'Donnell match set
(H-O match set) $MS(t)$ for $t$, we compute $q$ small Chase match sets $ms_1(t_i), \ldots, ms_q(t_i)$ for each
child $t_i$ of $t$ and $q$ small H-O match sets $MS_1(t), \ldots, MS_q(t)$ for $t$ as follows:

$$ms_j(t_i) = \mu^i_{f,j}(MS(t_i))$$

$$MS_j(t) = \theta_{f,j}(ms_j(t_1), \ldots, ms_j(t_k))$$

Then we compute the H-O match set $MS(t) = MS_1(t) \cup \cdots \cup MS_q(t)$. This disjoint union
can be computed in $O(q)$ time either by hashing or by table lookup. If table lookup is used, we
need a union table $T_f$ that maps the tuple $[MS_1(t), \ldots, MS_q(t)]$ to the union
$MS_1(t) \cup \cdots \cup MS_q(t)$. Since $MS_j(t) \subseteq PF_{f,j}$, and $|PF_{f,j}| \le l_f/q$, the size of $T_f$ is
$O((2^{l_f/q})^q) = O(2^{l_f})$.

Consider the space required by the $\theta_f$ tables. If $r^i_{f,j} = |\mathrm{range}\ \mu^i_{f,j}|$, then $r^i_{f,j} = O(2^{|\Pi^i_{f,j}|}) =
O(2^{|PF_{f,j}|}) = O(2^{l_f/q})$, and $|\theta_{f,j}| = O(\prod_{i=1}^{k} r^i_{f,j}) = O(2^{l_f k/q})$. Thus, the total space storing the $q$
$\theta_f$ tables is $O(q\,2^{l_f k/q})$, which for $q > 1$ is asymptotically better than Chase's algorithm in the worst
case.
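
The decomposed Bottom-Up Step can be exercised on a toy example. The Python sketch below is an assumption-laden illustration: patterns are nested tuples with "v" as the variable, the maps $\mu^i_{f,j}$ and $\theta_{f,j}$ are evaluated on the fly by set operations instead of being tabulated, and the element v that belongs to every match set is not added to the results. The names `theta` and `decomposed_step` are hypothetical.

```python
def theta(group, child_sets):
    """f-patterns in group whose i-th argument lies in child_sets[i]."""
    return {p for p in group
            if all(c in m for c, m in zip(p[1:], child_sets))}

def decomposed_step(groups, child_match_sets):
    """Union of the small H-O match sets, one per group PF_{f,j}."""
    result = set()
    k = len(child_match_sets)
    for g in groups:
        # Pi^i_{f,j}: the i-th children of the patterns in this group
        pi = [{p[i] for p in g} for i in range(1, k + 1)]
        # ms_j(t_i) = MS(t_i) intersected with Pi^i_{f,j}
        small = [m & s for m, s in zip(child_match_sets, pi)]
        result |= theta(g, small)
    return result

pf_f = [("f", "v", "v"), ("f", ("a",), "v"),
        ("f", ("a",), ("b",)), ("f", "v", ("b",))]
groups = [pf_f[:2], pf_f[2:]]                 # q = 2
children = [{"v", ("a",)}, {"v"}]            # MS(t_1), MS(t_2)
direct = theta(pf_f, children)
split = decomposed_step(groups, children)
```

Both computations yield the same two patterns, f(v, v) and f(a, v); the decomposition changes only how much table space a tabulated version would need.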

Space for other data objects is as follows.

1. SE_tree for the ranges of $\theta$ tables. Since each set x in range $\theta_{f,j}$ is a subset of $PF_{f,j}$,
$|\mathrm{range}\ \theta_{f,j}| = O(2^{|PF_{f,j}|}) = O(2^{l_f/q})$. Thus the space of the SE_tree encoding range $\theta_{f,j}$ is
$O(2^{l_f/q}\,l_f/q)$. Since there are $q$ such SE_trees for $f$, the space for all these SE_trees is
$O(l_f\,2^{l_f/q})$. If the partition method is not used, we have one SE_tree encoding range $\theta_f$ which
takes $O(2^{l_f}\,l_f)$ space.

2. Similarly, the SE_tree for range $\mu^i_{f,j}$ takes up $O(2^{l_f/q}\,l_f/q)$ space. There are $k\,q$ such trees
for $f$ with $O(l_f\,k\,2^{l_f/q})$ cumulative space bound.

3. The space for $\mu^i_{f,j}$ is $O(|\Gamma|) = O(2^l)$. There are $q\,k$ such maps for $f$ occupying $O(q\,k\,2^l)$
space altogether.

In summary, the total space for each function symbol $f$ is $O(q\,k\,2^l + l_f\,k\,2^{l_f/q} + q\,2^{l_f k/q})$.
When $q$ is about $l_f k/(\log k + l)$, rounded to an integer, we obtain the approximate minimum
$O(l_f\,k^2\,2^l/l)$. Summing over all
function symbols, we get the overall space bound $O(k_{max}^2\,2^l)$.

This upper bound can be further improved by reducing the size of $\mu$ maps and the union
tables. Let $PF_f$, $PF_{f,j}$, $\Pi^i_{f,j}$, and $\theta_{f,j}$ be defined as before. We split each map $\mu^i_{f,j}$ into smaller
maps $\mu^{i,\alpha}_{f,j}$ with domain $\mu^{i,\alpha}_{f,j}$ = range $\alpha$, where $\alpha = \theta_{g,s}$, $g \in F$, $s = 1, \ldots, q$. The size of $\mu^{i,\alpha}_{f,j}$ is
$O(|\mathrm{range}\ \alpha|) = O(2^{l_g/q})$. Summing over all $g \in F$, $s = 1, \ldots, q$, $j = 1, \ldots, q$, $i = 1, \ldots, k$, and $f \in F$, we
get an upper bound $O(k_{max}\,q^2\,|F|\,2^{l/q})$ for the total space needed for the $\mu$ tables.

Because of this splitting, the Bottom-Up Step should be modified accordingly. Let
$t = f(t_1, \ldots, t_k)$ be a subject tree. Assume that $t_i = g_i(\ldots)$. As before, we split the H-O match set $MS(t)$
into $q$ small H-O match sets $MS_1(t), \ldots, MS_q(t)$, and split each Chase match set $ms(t_i)$ into $q$ small
Chase match sets $ms_1(t_i), \ldots, ms_q(t_i)$. The small H-O match sets are computed as before:

$$MS_j(t) = \theta_{f,j}(ms_j(t_1), \ldots, ms_j(t_k))$$

but the small Chase match sets are computed differently:

$$ms_j(t_i) = ms_{j,1}(t_i) \cup \cdots \cup ms_{j,q}(t_i)$$

where $ms_{j,s}(t_i) = \mu^{i,\beta}_{f,j}(MS_s(t_i))$, $\beta = \theta_{g_i,s}$. Again, the disjoint union $ms_{j,1}(t_i) \cup \cdots \cup ms_{j,q}(t_i)$ can
be computed in $O(q)$ time either by hashing or table lookup. This increases the time per step to $q^2$.
If table lookup is used for the disjoint union, then we need a union table $T^{i,g_i}_{f,j}$ to map the tuple
$[ms_{j,1}(t_i), \ldots, ms_{j,q}(t_i)]$ into the union $ms_{j,1}(t_i) \cup \cdots \cup ms_{j,q}(t_i)$. Let $\gamma_{g_i,s} = PF_{g_i,s} \cap \Pi^i_{f,j}$, and
let $\gamma_{g_i} = PF_{g_i} \cap \Pi^i_{f,j}$. Since $ms_{j,s}(t_i) = \mu^{i,\beta}_{f,j}(MS_s(t_i)) \subseteq PF_{g_i,s} \cap \Pi^i_{f,j} = \gamma_{g_i,s}$, the size of $T^{i,g_i}_{f,j}$
is $O(2^{|\gamma_{g_i,1}|} \times \cdots \times 2^{|\gamma_{g_i,q}|}) = O(2^{|\gamma_{g_i,1} \cup \cdots \cup \gamma_{g_i,q}|}) = O(2^{|\gamma_{g_i}|})$. Summing over all the function
symbols $g_i$, we obtain the upper bound $O(2^{|\Pi^i_{f,j}|}) = O(2^{l_f/q})$ for the total space for the union tables of
the form $T^{i,g_i}_{f,j}$. Summing this space further for $j = 1, \ldots, q$, $i = 1, \ldots, k$, and $f \in F$, we get upper bound
$O(k_{max}\,q\,2^{l/q})$ for the total space for all the union tables, which is less than the space for the $\mu$
tables.

The space for the SE_trees and $\theta$ tables is roughly as before. Thus the overall space is

$$O(k_{max}\,q^2\,l\,2^{l/q} + q\,2^{l\,k_{max}/q}).$$

Since this approach is meaningful only for step time complexities better than $O(l)$, i.e.,
$q = O(\sqrt{l})$, the best upper bound we can get in this case is roughly $O(\sqrt{l}\,2^{c\,k_{max}\sqrt{l}})$ for some constant
c.  This  result  also  indicates  that  this  approach  is  useful  only  when  I T I  »  2  q . 

In  a  practical  implementation  it  is  not  necessary  for  PFf  to  be  partitioned  into  disjoint  equal 
size  subsets.  For  example,  we  can  let  PF*  \  be  the  set  of  patterns  that  are  not  children  of  any  pat¬ 
tern,  PF*j  be  the  set  of  children  of  patterns  in  PFti_l  not  contained  in  PF* j,  where 
i  =  L. maximum  height  of  patterns,  j  <  i.  Then  the  maps  pj”  can  be  omitted  for  a  =  Qgs,  where  s 
>  j+ 1.  Alternatively,  we  can  let  PF* ,  be  the  set  of  all  children  of  patterns  in  PF*  ,_| .  Now  the 
size  of  each  subset  may  grow,  but  the  maps  pj?“  can  be  omitted  for  all  a  =  0„  v,  where  s  ^j+ 1.  It 
is an interesting question how to find a partition of PF that minimizes the map size for a fixed per
step  time  bound. 

8.  Match  set  elimination 

Aiming for a bottom-up pattern matching method that utilizes space efficiently by avoiding
conversion and transition maps, Hoffmann and O'Donnell [20] investigated the subclass of binary
Simple Patterns; i.e., Simple Patterns in which the maximum arity of any function symbol is two.
Although greatly restricted, this class is interesting, because conventional arithmetic and operations
in combinatory logic have arity less than or equal to two. For binary Simple Patterns they gave an
algorithm that requires no transition maps, but uses $O(l^2)$ space for both preprocessing and computing
MPTMp, $O(l\,h^2)$ preprocessing time (recall that h is the longest path in the subsumption tree), and
$O(h^2)$ time instead of $O(1)$ time for the Bottom-Up Step (1).

Hoffmann and O'Donnell also considered reducing pattern forests to equivalent binary form.
For each function symbol $f \in F$ where $A(f) > 2$, introduce a new function symbol $two_f$.
Transformation T1 replaces each $f$-pattern $f(x_1, \ldots, x_k)$ in PF where $k > 2$ by
$f(two_f(x_1, x_2), x_3, \ldots, x_k)$. Transformation T2 applies T1 repeatedly until it can no
longer be applied.
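
As an illustration only, the two transformations can be sketched in Python. The representation is an assumption made for this sketch and does not come from the paper: a pattern is either a string (a variable or constant) or a tuple whose first entry is the function symbol, and the helper names `t1_step` and `t2` are hypothetical.

```python
def t1_step(pattern):
    """Apply T1 once at the root: f(x1, x2, x3, ..., xk) -> f(two_f(x1, x2), x3, ..., xk)."""
    if isinstance(pattern, tuple) and len(pattern) - 1 > 2:
        f, x1, x2, *rest = pattern
        return (f, ("two_" + f, x1, x2), *rest)
    return pattern


def t2(pattern):
    """Apply T1 repeatedly, at the root and in every subpattern, until no arity exceeds 2."""
    if not isinstance(pattern, tuple):
        return pattern                      # a leaf: variable or constant
    f, *args = pattern
    reduced = (f, *[t2(a) for a in args])   # binarize the children first
    while len(reduced) - 1 > 2:             # then shrink the root one argument at a time
        reduced = t1_step(reduced)
    return reduced


if __name__ == "__main__":
    print(t2(("f", "a", "b", "c", "d")))
```

Each application of T1 lowers the arity at the root by one, so a $k$-ary pattern needs $k - 2$ applications.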

The following lemma states without proof that transformation T1 and, consequently, T2 are
correct.

LEMMA 8. Let patterns $p'$ and $q'$ be formed from patterns $p, q \in PF$ by transformation T1.
Then $p \leq q$ if and only if $p' \leq q'$.

Although it is correct, transformation T2 may not always be usefully applied. Hoffmann and
O'Donnell showed that T2 sometimes, but not always, preserves the Simple Pattern property. For
a counterexample, consider two patterns $f(x_1, x_2, x_3)$ and $f(y_1, y_2, y_3)$ in a Simple pattern forest
PF. If $x_1 > y_1$, $x_2 < y_2$, and $x_3$ is incomparable with $y_3$, then the new pattern forest that results from
transformation T1 would not be Simple, because $two_f(x_1, x_2)$ and $two_f(y_1, y_2)$ are incomparable
yet share the lower bound $two_f(y_1, x_2)$.

However, we can give an interesting class of pattern forests that remains Simple under
transformation T2. A Simple pattern forest PF is Very Simple if for each $k$-ary function symbol
$f \in F$ with $k \geq 2$ and every two distinct $f$-patterns $f(x_1, \ldots, x_k)$ and $f(y_1, \ldots, y_k)$, we know that

$\forall i = 1, \ldots, k-1:\ \big((\forall j = 1, \ldots, i \mid x_j \geq y_j) \text{ and } (\exists j = 1, \ldots, i \mid x_j > y_j)\big) \Rightarrow x_{i+1} \not< y_{i+1}$.

LEMMA  9.  Pattern  forest  PF  is  Very  Simple  if  and  only  if  the  pattern  forest  PF'  that  results 
from  transformation  T 1  is  Very  Simple. 

LEMMA  10.  If  a  Binary  pattern  forest  is  Simple,  then  it  is  also  Very  Simple. 

Proof. If $f(x_1, y_1)$ and $f(x_2, y_2)$ are any two $f$-patterns in PF and $x_1 < x_2$, then $y_2 \not< y_1$.
Otherwise, PF would not be Simple; that is, we would have $f(x_1, y_2) < f(x_1, y_1)$ and
$f(x_1, y_2) < f(x_2, y_2)$. ■

The preceding lemmas show that

THEOREM  11.  The  class  of  Very  Simple  Patterns  is  the  largest  subclass  of  Simple  Patterns 
for  which  transformation  T  2  preserves  Pattern  Forests  that  are  Simple. 

We will give a bottom-up algorithm for binary Simple Patterns with $O(l)$ space to compute
MPTMp and $O(\log l)$ time to compute the Bottom-Up Step. Our preprocessing time and space are
the same as those of Hoffmann and O'Donnell. The algorithm makes use of persistent search trees
[33], and we expect it to be fast in practice.

Let PF be the pattern forest for a set $P$ of Simple Patterns, and let $T$ be its subsumption tree.
Recall that for Simple Patterns each match set can be represented by the unique minimum pattern
in the set. If $p_i$ represents the match set for subpattern $t_i$ of the subject, $i = 1..k$, then the match set
for $f(t_1, \ldots, t_k)$ is represented by the pattern determined by the following formula:

(New Bottom-Up Step):

(4) $\min/(\{v\} \cup \{f(q_1, \ldots, q_k) \in PF \mid q_i \geq p_i,\ i = 1..k\})$

We call pattern $f(p_1, \ldots, p_k)$ the search argument for Step (4).

Consider any binary function $f$ appearing in PF, and let $f(p_1, p_2)$ be the search argument for
Step (4). (We will not discuss unary patterns and constants, which are simpler subcases.) We want
to analyze (i) the worst case cost of performing Step (4); and (ii) the auxiliary space while executing
Step (4).

An important observation is that, unlike patterns $p_1$ and $p_2$, search argument $f(p_1, p_2)$
might not belong to the subsumption tree $T$! Consequently, if we let $1 > v$ denote a new maximum
pattern, and if we define relation $R = \{[x, y] : f(x, y) \in PF\} \cup \{[1, 1]\}$, then we can replace Step
(4) for search argument $f(p_1, p_2)$ more conveniently by

(5) $\min/\{[x, y] \in R \mid x \geq p_1 \text{ and } y \geq p_2\}$

If $[1, 1]$ is the answer to query (5), then $v$ is the answer to query (4); otherwise, if $[w, z]$ is the
answer to (5) for $w, z \neq 1$, then $f(w, z)$ is the answer to query (4).

Expression (5) can be computed by locating the pair belonging to $R$ of nearest ancestors of
nodes $p_1$ and $p_2$ with respect to subsumption tree $T$. This characterization is meaningful because
of Lemma 10.

In  order  to  compute  (5)  efficiently,  the  difficulties  of  two  dimensional  ancestor  testing  and 
searching  within  partially  ordered  sets  need  to  be  overcome.  This  is  done  by  reducing  the  two 
dimensional  nearest  ancestor  search  in  tree  T  to  single  dimensional  searching  through  a  totally 
ordered  set.  The  essential  idea  is  presented  just  below. 

Let $R\{x\}$ denote the set $\{y : [x, y] \in R\}$, and let domain $R$ denote the set $\Pi_f^1 = \{x : [x, y] \in R\}$.
For each $x \in$ domain $R$, define set $S(x) = \bigcup_{y \geq x} R\{y\}$; for each $z \in S(x)$ define witness

$w(x, z) = \min/\{y \in R^{-1}\{z\} \mid y \geq x\}$

Then we can compute (5) by performing these three queries:

(6) i. $q_1 = \min/\{x \in \text{domain } R \mid x \geq p_1\}$

ii. $q_2 = \min/\{y \in S(q_1) \mid y \geq p_2\}$

iii. $q_3 = w(q_1, q_2)$

If either $q_1$ or $q_2$ equals 1, then $v$ is the answer to query (4); otherwise, the answer is $f(q_3, q_2)$.
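
To make the decomposition concrete, the following Python sketch compares a brute-force evaluation of query (5) with the three queries (6) on a tiny example. It is a sketch under stated assumptions, not the efficient method developed in this section: the subsumption tree is given as a hypothetical `parent` map, the strings `"1"` and `"v"` stand for the artificial maximum and the root, and the example relation `R` is made up for illustration (it corresponds to the Very Simple forest with patterns f(a, c) and f(v, a)).

```python
TOP = "1"  # artificial maximum pattern, placed above the root v of T


def chain(parent, n):
    """Return n followed by all of its ancestors up to TOP, nearest first."""
    out = [n]
    while n != TOP:
        n = parent[n]
        out.append(n)
    return out


def nearest(parent, Q, p):
    """min/{x in Q | x >= p}: p itself or its nearest ancestor that lies in Q."""
    return next(x for x in chain(parent, p) if x in Q)


def query5(parent, R, p1, p2):
    """Brute force: the minimum of {[x, y] in R | x >= p1 and y >= p2}."""
    up1, up2 = chain(parent, p1), chain(parent, p2)
    cands = [(x, y) for (x, y) in R if x in up1 and y in up2]
    for (x, y) in cands:
        # (x, y) is the minimum if every candidate dominates it componentwise.
        if all(x2 in chain(parent, x) and y2 in chain(parent, y) for (x2, y2) in cands):
            return (x, y)
    return None


def query6(parent, R, p1, p2):
    """The three one-dimensional queries (6); returns the pair [q3, q2]."""
    domain = {x for (x, _) in R}
    q1 = nearest(parent, domain, p1)                          # (6) i.
    S = {y for (x, y) in R if x in chain(parent, q1)}         # S(q1)
    q2 = nearest(parent, S, p2)                               # (6) ii.
    q3 = nearest(parent, {x for (x, y) in R if y == q2}, q1)  # (6) iii., the witness
    return (q3, q2)


# Hypothetical example: v is the root, a and c are children of v, b is a child of a.
PARENT = {"v": TOP, "a": "v", "b": "a", "c": "v"}
R = [("a", "c"), ("v", "a"), (TOP, TOP)]

if __name__ == "__main__":
    print(query6(PARENT, R, "b", "b"))   # the witness v differs from q1 = a here
```

For the search argument f(b, b) the first query returns a, but the only pair in R whose second component is an ancestor of b is [v, a], so the witness query moves the first component up to v.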

The three queries (6) reduce computation (5) to finding single dimensional nearest ancestors
and computing and storing sets $S(x)$. Nearest ancestors in trees can be computed efficiently based
on the following idea. Let $pre(i)$ and $des(i)$ be the preorder number and descendant count of node $i$
in tree $T$. Then node $i$ is an ancestor of node $j$ iff $pre(i) \leq pre(j) < pre(i) + des(i)$; also, if $i$ and $k$
are both ancestors of $j$, then $i$ is nearer than $k$ to $j$ iff $pre(i) > pre(k)$.
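
A minimal Python sketch of this numbering and the resulting ancestor test follows. It assumes, as the strict upper bound in the test and the sentinel value used later in Algorithm Compute_find suggest, that the descendant count of a node includes the node itself; the `children` map and the function names are hypothetical.

```python
def number(children, root):
    """Return (pre, des): preorder numbers starting at 1 and subtree sizes."""
    pre, des = {}, {}
    counter = 0

    def visit(n):
        nonlocal counter
        counter += 1
        pre[n] = counter
        for c in children.get(n, []):
            visit(c)
        des[n] = counter - pre[n] + 1   # assumption: des counts n itself

    visit(root)
    return pre, des


def is_ancestor(pre, des, i, j):
    """True iff i is an ancestor of j (every node counts as its own ancestor)."""
    return pre[i] <= pre[j] < pre[i] + des[i]


if __name__ == "__main__":
    pre, des = number({"v": ["a", "c"], "a": ["b"]}, "v")
    print(pre, des, is_ancestor(pre, des, "a", "b"))
```

With this convention the descendants of $i$ occupy exactly the preorder interval from $pre(i)$ up to, but not including, $pre(i) + des(i)$.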

Let $Q$ be any subset of the nodes in $T$. Then for any node $p$ in $T$, we can compute

(7) $\min/\{x \in Q \mid x \geq p\}$

whenever a solution exists by finding the node $i$ in $Q$ with maximum $pre(i)$ such that
$pre(i) \leq pre(p) < pre(i) + des(i)$. To facilitate this computation we can preprocess $Q$ as follows.
For all $i$ in $Q$ define function $find(pre(i)) = i$. Also, for all $i \in Q$, whenever there is no $j \in Q$ such
that $pre(j) = pre(i) + des(i)$, we define $find(pre(i) + des(i))$ to be the nearest ancestor $k$ of $i$
belonging to $Q$; i.e., the node $k \in Q$ such that $pre(k)$ is the maximum for which
$pre(k) \leq pre(i) + des(i) < pre(k) + des(k)$. Hence, (7) can be solved by computing $find(z)$, where
$z$ is the greatest element in domain $find$ such that $z \leq pre(p)$.
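
The lookup itself can be sketched in Python with a sorted key list and binary search, which stands in here for the search structures discussed next. The `find` map in the example is made-up data for a hypothetical set Q of three nodes in a tree of seven nodes, with the string `"1"` denoting an artificial top element stored under key 0.

```python
from bisect import bisect_right


def make_query(find):
    """Return a function answering query (7) from a precomputed find map.

    Assumes key 0 is present (the artificial left boundary), so that every
    preorder number >= 1 has a predecessor among the keys.
    """
    keys = sorted(find)

    def nearest_ancestor(pre_p):
        z = keys[bisect_right(keys, pre_p) - 1]   # greatest key z with z <= pre_p
        return find[z]

    return nearest_ancestor


# Hypothetical Q: x1 covers preorder numbers 2..4, x2 covers 3, x3 covers 6.
FIND = {0: "1", 2: "x1", 3: "x2", 4: "x1", 5: "1", 6: "x3", 7: "1"}

if __name__ == "__main__":
    query = make_query(FIND)
    print([query(n) for n in range(1, 8)])
```

Keys 4, 5, and 7 are the right boundaries: they record which node of Q becomes the nearest ancestor again once the subtree of x2, x1, or x3 has ended.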

We can store domain $find$ as either a red/black tree [16, 36] or Willard's variant of the Van
Emde Boas priority queue [37, 38] and obtain the following time/space bounds. Both data structures
use space $O(|Q|)$. Computing query (7) costs $O(\log |Q|)$ time with red/black trees, and
$O(\log\log l)$ time with priority queues (where $l$ is the number of nodes in $T$).

Based on the preceding analysis, we can perform query (6) (i.) with $O(l)$ cumulative space if
we store all of the domains of $find$ maps for each binary function $f \in F$ either as red/black trees or
Van Emde Boas priority queues. Query time is $O(\log l_f)$ using red/black trees, $O(\log\log l)$ with
priority queues.

To facilitate query (6) (ii.) and (iii.) we can combine witnesses and find maps as follows. Let
$find_x$ be the find map for $S(x)$. Then define

$findw_x(z) = [w(x, find_x(z)),\ find_x(z)]$

We can store all these $findw_x$ maps for each $x \in \Pi_f^1$ using a minor variant of the persistent search
tree of Sarnak and Tarjan [33] (see also [11]). Recall that a persistent search tree can store a
sequence $T_0, T_1, \ldots, T_r$ of sets, where $T_0$ is empty, in space $O(s)$ in which
$s = \sum_{i=0}^{r-1} |T_i \,\Delta\, T_{i+1}|$ and $\Delta$ represents symmetric difference. It can also support the
nearest neighbor operation $pred(i, x) = \max/\{y \in T_i \mid y \leq x\}$ in $O(\log s)$ worst case time.
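
The interface of such a structure can be illustrated with a much simpler technique than the one cited: the Python sketch below uses path copying on an unbalanced binary search tree. This is a substitute chosen for brevity, not the Sarnak and Tarjan structure; it spends space proportional to the search path on every update and gives no worst case balance guarantee. All names are hypothetical.

```python
from typing import NamedTuple, Optional


class Node(NamedTuple):
    key: int
    left: Optional["Node"]
    right: Optional["Node"]


def insert(t, key):
    """Return the root of a new version containing key; the old version stays valid."""
    if t is None:
        return Node(key, None, None)
    if key < t.key:
        return Node(t.key, insert(t.left, key), t.right)   # copy only the search path
    if key > t.key:
        return Node(t.key, t.left, insert(t.right, key))
    return t                                               # key already present


def pred(t, x):
    """max/{y in t | y <= x}, or None if no such key exists."""
    best = None
    while t is not None:
        if t.key <= x:
            best = t.key
            t = t.right
        else:
            t = t.left
    return best


# versions[i] plays the role of T_i; T_0 is the empty set.
versions = [None]
for k in (5, 2, 8):
    versions.append(insert(versions[-1], k))

if __name__ == "__main__":
    print(pred(versions[1], 6), pred(versions[2], 4), pred(versions[3], 9))
```

Because old roots are never modified, `pred(versions[i], x)` can be asked of any earlier version after later insertions, which is the access pattern needed for the $findw_x$ maps.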

Consider the sequence $[findw_x : x \in \Pi_f^1]$ of maps ordered according to a preordering of $\Pi_f^1$
relative to the subsumption tree (where the empty set is implicitly the first member of the
sequence). Let us store this sequence in a persistent search tree using domain values of the $findw$
maps as keys. Since the sum of the sizes of the symmetric differences of successive maps in the
sequence is bounded by $O(\sum_{x \in \Pi_f^1} |R\{x\}|) = O(l_f)$, query (6) (ii.) and (iii.) can be solved in time
$O(\log l_f)$. If $q_1$ is the answer to query (6) (i.), then the pair $[q_3, q_2] = findw_{q_1}(z)$ solves queries (6)
(ii.) and (iii.), where $z$ is the greatest element in domain $findw_{q_1}$ such that $z \leq pre(p_2)$. The
cumulative space for storing $findw$ maps in persistent search trees for all the binary functions $f \in F$ is just
$O(l)$.

Preprocessing for solving (6) involves constructing the subsumption tree $T$ and computing
preorder and descendant numbers ($pre$ and $des$) for each of its nodes. Hoffmann and O'Donnell's
Algorithm A [20] decides whether PF is Simple and computes the transitive closure of $T$ in time
$O(l^2 k_{max})$ and space $O(l^2)$. It is straightforward to modify their algorithm to decide whether PF is
Very Simple and to produce $T$ without changing the theoretical complexity. Once $T$ is available,
$pre$ and $des$ can be computed in $O(l)$ steps (since $T$ has $l$ nodes).

Preprocessing for (6) (i.) involves computing find maps over set $\Pi_f^1$ for each function symbol
$f \in F$. If $\Pi_f^1$ is preordered with respect to $T$, we can compute the find map for $f$ as follows. Pass
through $\Pi_f^1$ in linear time, defining $find(pre(x))$ to be $x$ for each $x \in \Pi_f^1$ encountered. Recall that
we also need to compute the nearest ancestor of $x$ in $\Pi_f^1$ to be assigned to $find(pre(x) + des(x))$
whenever $pre(x) + des(x)$ is not the preorder number of some node $y \in \Pi_f^1$. These ancestors can be
computed by stacking the anticipated number $pre(x) + des(x)$ together with the ancestor of $x$ while
searching through $\Pi_f^1$. It may be helpful to think of the algorithm as processing numbers $pre(x)$
as left parentheses (which are all distinct) and $pre(x) + des(x)$ as balancing right parentheses (which
need not be distinct for different values of $x$). Details are given below.

-- Initialize ancestor to be the artificial top element of all nodes in T
-- whose preorder number is less than old_num = l+1; its ancestor
-- old_ancestor is undefined
ancestor := 1

-- Handle left boundary of T using 0 as an artificial preorder number
find(0) := ancestor
old_ancestor := undefined
old_num := l+1
stack := [[old_num, old_ancestor]]

for x ∈ Π_f^1 loop
    (while old_num < pre(x))
        -- old_num is the pre(y)+des(y) for some node y whose nearest ancestor
        -- is old_ancestor
        find(old_num) := old_ancestor
        pop stack
        ancestor := old_ancestor
        [old_num, old_ancestor] := top stack
    end

    if old_num = pre(x) then
        -- the right boundary coincides with pre(x), so find(pre(x)) := x covers it
        pop stack
        ancestor := old_ancestor
        [old_num, old_ancestor] := top stack
    end

    find(pre(x)) := x

    if old_num ≠ pre(x)+des(x) then
        -- This test guarantees that old_num values in successive stack entries must
        -- be distinct
        old_num := pre(x)+des(x)
        old_ancestor := ancestor
        push [old_num, old_ancestor] onto stack
    end

    ancestor := x
end

(while old_ancestor ≠ undefined)
    -- Process remaining right boundaries
    find(old_num) := old_ancestor
    pop stack
    [old_num, old_ancestor] := top stack
end

Algorithm  Compute_find 
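
For readers who want to experiment, Algorithm Compute_find can be transcribed into Python roughly as follows. This is a sketch, not the authors' implementation: the argument layout (a list of `(x, pre, des)` triples in increasing preorder), the use of `None` for the undefined ancestor, the string `"1"` for the artificial top element, and the assumption that `des` counts the node itself are all choices made for this example.

```python
def compute_find(nodes, l, top="1"):
    """Build the find map for a node set given in increasing preorder.

    nodes: list of (x, pre, des) triples, des counting x itself (assumption).
    l:     number of nodes in T, so l + 1 exceeds every right boundary in use.
    """
    find = {0: top}                 # artificial left boundary of T
    ancestor = top                  # innermost node whose subtree is still open
    stack = [(l + 1, None)]         # sentinel entry [old_num, old_ancestor]
    for x, pre, des in nodes:
        old_num, old_ancestor = stack[-1]
        while old_num < pre:        # close every subtree that ended before x
            find[old_num] = old_ancestor
            stack.pop()
            ancestor = old_ancestor
            old_num, old_ancestor = stack[-1]
        if old_num == pre:          # boundary coincides with pre(x); x takes that key
            stack.pop()
            ancestor = old_ancestor
            old_num, old_ancestor = stack[-1]
        find[pre] = x
        if old_num != pre + des:    # keep boundaries on the stack distinct
            stack.append((pre + des, ancestor))
        ancestor = x
    while stack[-1][1] is not None:  # process the remaining right boundaries
        old_num, old_ancestor = stack.pop()
        find[old_num] = old_ancestor
    return find


if __name__ == "__main__":
    # Hypothetical input: x1 covers preorder 2..4, x2 covers 3, x3 covers 6; l = 7.
    print(compute_find([("x1", 2, 3), ("x2", 3, 1), ("x3", 6, 1)], 7))
```

Every node is pushed and popped at most once, which matches the linear step count claimed for the algorithm.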

Algorithm Compute_find runs in $O(l_f)$ steps. If we fold in the code to store domain $find$ in a
red/black tree, the preprocessing time is $O(l_f \log l_f)$. In a single preorder traversal of $T$, we can
preorder the elements of $\Pi_f^1$ for all functions $f \in F$ in $O(l)$ time. The total preprocessing time to
compute red/black trees storing find maps for all of the function symbols together is then
$O(l \log l)$. Using Willard's data structure instead takes expected time $O(l \log l)$ or worst case time
$O(l^2 \log l)$, because it depends on perfect hashing [13].

Preprocessing for (6) (ii.) and (iii.) involves computing $findw_x$ maps over sets $S(x)$ for each
$x \in \Pi_f^1$. We compute these maps according to a preorder search through $\Pi_f^1$. Suppose that $y$ comes
immediately after $x$ in the preordering of $\Pi_f^1$. Suppose also that $findw_x$ is computed for set $S(x)$.
Our goal is to compute $findw_y$ for set $S(y)$ by performing modifications to $findw_x$. It suffices to
consider two cases: (1) where $y$ is a proper descendant of $x$ in $T$, and (2) otherwise.

If $y$ is a proper descendant of $x$, then $S(y) = S(x) \cup R\{y\}$. In this case we can compute $findw_y$
by first computing the find map $local\_find$ for $R\{y\}$ using Algorithm Compute_find. By Lemma
10 we know that no element in $R\{y\}$ is a proper ancestor of any element in $S(x)$. Hence, for each
$z \in$ domain $local\_find$, if $local\_find(z) \neq 1$, we perform the update $findw_x(z) := [y, local\_find(z)]$,
where $y$ will always be a new witness; otherwise if $local\_find(z) = 1$, we perform a nearest neighbor
query $a = \max/\{u \in \text{domain } findw_x \mid u \leq z\}$, and assign $findw_x(a)$ to $findw_x(z)$. The map that
results from these operations is $findw_y$.

If we assume that dummy value 0 is the first element in $\Pi_f^1$ in which $S(0)$ and $findw_0$ are
both empty, then the preceding approach for case (1) can be used to compute the first $findw$ map in
our sequence. To handle case (2) in which $y$ is not a proper descendant of $x$, we first find the
closest proper ancestor $u$ of $y$ in $\Pi_f^1$, where dummy value 0 is regarded as a proper ancestor of
every other node. Next, we update $findw_x$ to form a copy of $findw_u$. Finally, we update the copy
of $findw_u$ to obtain $findw_y$ using the method for case (1).

More specifically, let $A$ be the union of the sets $(\{pre(i) : i \in R\{y\}\} \cup
\{pre(i) + des(i) : i \in R\{y\}\})$ for all $y$ coming after $u$ among the preordered elements of $\Pi_f^1$ such that $y$
is an ancestor of $x$. Then for each $z \in A$, if it belongs to the domain of $findw_u$, assign $findw_u(z)$ to
$findw_x(z)$; otherwise, remove $z$ from domain $findw_x$. This step turns $findw_x$ into a copy of $findw_u$.
Map $findw_y$ is obtained by further modifying $findw_x$ according to the method for case (1).

If we use a persistent red/black tree, the total preprocessing costs to compute and store maps
$findw$ for function $f$ are $O(l_f)$ space and $O(l_f \log l_f)$ time. The cumulative preprocessing costs to
compute these maps for all functions $f \in F$ are thus $O(l)$ space and $O(l \log l)$ time.

Summing  up  the  preceding  discussion,  we  have 

THEOREM 12. Bottom-Up Step (4) can be computed for binary Simple Patterns in $O(\log l)$
time and $O(l)$ auxiliary space. Total preprocessing costs are $O(l^2)$ time and space.

The reduction of Very Simple pattern forests PF to binary form introduces $O(|F|)$ new function
symbols and $O(k_{max}\, l)$ new subpatterns. The cost of the Bottom-Up Step is approximately
doubled, while the theoretical complexity for preprocessing remains unchanged.

The time bound for Theorem 12 can be improved to $O((\log\log l)^2)$ by using a persistent form
of the Van Emde Boas queues to answer queries of type (6) (ii.) and (iii.). These queues can be
made persistent by applying the results of Dietz [8]. Dietz's result gives as an immediate corollary
that the Van Emde Boas structure can be made persistent at a time cost of a factor of $\log\log l$ per
operation. The time for lookups is worst-case; the preprocessing time (to build the data structure)
is expected, because it depends on hashing [9, 13] to keep the space down. The space bound remains
$O(l)$. The expected preprocessing time is $O(l (\log\log l)^2)$.

9. Conclusion

We  believe  that  a  deeper  analysis  and  exploitation  of  the  structure  of  pattern  matching  can 
lead  to  further  algorithmic  improvements.  It  might  also  be  worthwhile  to  consider  hybrid  pattern 
matching  methods  that  combine  our  different  algorithms.  The  main  open  problem  in  the  method  of 
match  set  elimination  is  to  compute  the  subsumption  tree  T  in  better  time  and  space  than  Hoffmann 
and  O’Donnell’s  Algorithm  A.  Of  course,  this  method  would  also  benefit  from  improvements  in 
construction time for persistent Van Emde Boas priority queues. In a subsequent paper we will
report  how  to  extend  our  algorithms  to  a  more  complex  pattern  language,  which  is  used  to  perform 
semantic  analysis  within  RAPTS. 

Appendix: Pattern Algebra

Let $U$ be the set of all possible patterns. Let $\geq$ be the "more general than" relation between
patterns. The relation $\geq$ is reflexive, transitive and antisymmetric. Thus $(U, \geq)$ is a partial order. It is
easy to see that any subset $S$ of $U$ has a least upper bound $lub(S)$ in $U$. Thus, $U$ is a join lattice with
$v$ being the maximum element. But it is not a lattice.

Two patterns in $U$ are said to be compatible if they have a lower bound in $U$. We can show
that

LEMMA 13. A finite set of patterns $P$ has a greatest lower bound $glb(P)$ in $U$ iff these
patterns are mutually compatible.

Proof. The only if part is trivial. We need only to prove the if part.

Basis: $P$ contains at least one leaf $x$. If $x = v$, then $glb(P) = glb(\{x, glb(P - \{x\})\}) = glb(P - \{x\})$.
If $x$ is a constant, then $glb(P) = x$.

Induction: Suppose that all patterns in $P$ have the same function symbol $f$ with arity $k > 0$.
Then $glb(P) = f(glb(\{x_1 : f(x_1, \ldots, x_k) \in P\}), \ldots, glb(\{x_k : f(x_1, \ldots, x_k) \in P\}))$. ■
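
The proof is constructive, and the construction can be sketched in Python. The encoding is an assumption made for this sketch: the string `"v"` is the most general pattern, any other string is a constant, a tuple `(f, x_1, ..., x_k)` is an application of `f`, and `None` is returned when the patterns are not mutually compatible.

```python
V = "v"  # the most general pattern (maximum element of U)


def glb(patterns):
    """Greatest lower bound of a finite list of patterns, or None if incompatible."""
    rest = [p for p in patterns if p != V]    # v never lowers the bound
    if not rest:
        return V
    first = rest[0]
    if isinstance(first, str):                # basis: a constant leaf
        return first if all(p == first for p in rest) else None
    # induction: every pattern must apply the same symbol with the same arity
    if any(isinstance(p, str) or p[0] != first[0] or len(p) != len(first) for p in rest):
        return None
    args = [glb([p[i] for p in rest]) for i in range(1, len(first))]
    return None if any(a is None for a in args) else (first[0], *args)


if __name__ == "__main__":
    print(glb([("f", "v", "a"), ("f", "b", "v")]))
```

The sketch treats a constant as compatible only with itself and with $v$, which is how the basis case of the proof is read here.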

Let PF be any finite subset of $U$. A subset $M$ of PF is called a match set (wrt PF) if there is a
pattern $t$ in $U$ such that $M = \{x \in PF \mid x \geq t\}$. By the definition of match set and compatibility we
have,

LEMMA 14. If $M$ is a match set, then the patterns in $M$ are mutually compatible. ■

LEMMA 15. $M$ is a match set wrt PF iff $M = \{x \in PF \mid x \geq glb(M)\}$, i.e., iff $M$ is a match set
of $glb(M)$ wrt PF.

Proof. The if part is trivial. Consider the only if part. Since $M$ is a match set wrt PF, then
there is some $t \in U$ such that $M = \{x \in PF \mid x \geq t\}$. Since $glb(M) \geq t$, then
$M = \{x \in M \mid x \geq glb(M)\} \subseteq \{x \in PF \mid x \geq glb(M)\} \subseteq \{x \in PF \mid x \geq t\} = M$. ■

LEMMA 16. If $M_1$ and $M_2$ are two match sets wrt PF, then $M_1 \cap M_2$ is also a match set wrt
PF.

Proof. Since $M_1$ and $M_2$ are two match sets wrt PF, then $M_1 = \{x \in PF \mid x \geq glb(M_1)\}$
and $M_2 = \{x \in PF \mid x \geq glb(M_2)\}$. Therefore

$M_1 \cap M_2$

$= \{x \in PF \mid x \geq glb(M_1) \text{ and } x \geq glb(M_2)\}$

$= \{x \in PF \mid x \geq lub(\{glb(M_1), glb(M_2)\})\}$,

which is a match set. ■


